Utterance verification (UV) is a process by which the output of a speech recognizer is verified to determine whether the input speech actually includes the recognized keyword(s). The output of the verifier is a binary decision to accept or reject the recognized utterance based on a UV confidence score. In this paper, we extend the notion of utterance verification to not only detect errors but also selectively correct them. We perform error correction by flipping the hypotheses produced by an N-best recognizer when the top candidate has a lower UV confidence score than the next candidate. We propose two measures for computing confidence scores and investigate a hybrid confidence measure that combines them into a single score. Using this hybrid confidence measure and an N-best algorithm, we obtained an 11% improvement in word-error rate on a connected digit recognition task. This improvement was achieved while still maintaining reliable detection of nonkeyword speech and misrecognitions.
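The flip-then-verify procedure described above can be illustrated with a minimal sketch. The abstract does not specify the two confidence measures or how they are combined, so the field names `score_a` and `score_b`, the weighted-average combination, the `weight` parameter, and the `threshold` value are all assumptions made for illustration, not the method used in the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    text: str       # recognized word string, e.g. a digit sequence
    score_a: float  # first confidence measure (hypothetical name)
    score_b: float  # second confidence measure (hypothetical name)


def hybrid_confidence(h: Hypothesis, weight: float = 0.5) -> float:
    # Assumed combination: a weighted average of the two measures.
    # The paper's actual hybrid measure may differ.
    return weight * h.score_a + (1.0 - weight) * h.score_b


def verify_and_correct(
    nbest: List[Hypothesis], threshold: float, weight: float = 0.5
) -> Optional[str]:
    """Return the accepted hypothesis text, or None if the utterance is rejected.

    `nbest` is ordered by recognizer score, best first.
    """
    if not nbest:
        return None
    best = nbest[0]
    # Error correction: flip to the next candidate when it has a
    # higher UV confidence score than the top candidate.
    if len(nbest) > 1 and hybrid_confidence(nbest[1], weight) > hybrid_confidence(best, weight):
        best = nbest[1]
    # Error detection: binary accept/reject decision on the confidence score.
    if hybrid_confidence(best, weight) < threshold:
        return None
    return best.text
```

For example, calling `verify_and_correct` on a two-entry list whose second hypothesis has the higher combined score returns the second hypothesis, provided that score also clears the threshold; if neither candidate clears it, the utterance is rejected.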