1977
DOI: 10.1121/1.2016299
Perplexity—a measure of the difficulty of speech recognition tasks

Abstract: Using counterexamples, we show that vocabulary size and static and dynamic branching factors are all inadequate as measures of speech recognition complexity of finite state grammars. Information theoretic arguments show that perplexity (the logarithm of which is the familiar entropy) is a more appropriate measure of equivalent choice. It too has certain weaknesses which we discuss. We show that perplexity can also be applied to languages having no obvious statistical description, since an entropy-maximizing pr…

Cited by 226 publications (112 citation statements); references 0 publications.
“…5) containing each trigram at least once. The perplexity, 2^H (where H is the entropy [22]), of this vocabulary is 5.53 per digram. By comparison, the perplexity of an equivalent model built from the 1,000 most common English words is 10.31.…”
Section: Results
confidence: 99%
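The relationship invoked in the snippet above — perplexity as 2 raised to the entropy — can be sketched directly. This is an illustrative sketch, not code from the cited papers; the function name `perplexity` is my own choice.

```python
import math

def perplexity(probs):
    """Perplexity 2**H of a discrete distribution, where H is the
    Shannon entropy in bits (zero-probability outcomes contribute 0)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy

# A uniform distribution over 4 outcomes has entropy 2 bits, so its
# perplexity is 4: the model faces 4 equally likely choices per step.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # → 4.0
```

This is why perplexity is read as an "equivalent choice" measure: a model with perplexity 5.53 per digram is, on average, as uncertain as one choosing uniformly among about 5.5 alternatives.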
“…PPL = P(w1, …, wm)^(−1/m), is to maximize probability, because PPL is the inverse probability of the test set, normalized by the number of words. That is to say, a lower PPL indicates a better model [34]. If a language model built from the augmented corpus shows improved perplexity for the test set, it indicates the usefulness of our approach for corpus expansion.…”
Section: Discussion
confidence: 99%
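The test-set formulation quoted above, PPL = P(w1, …, wm)^(−1/m), is usually computed in log space for numerical stability. A minimal sketch, assuming the model's per-word conditional probabilities are already available (the function name `ppl` is illustrative):

```python
import math

def ppl(word_probs):
    """Test-set perplexity: the inverse probability of the word
    sequence, normalized by its length m, i.e. P(w1..wm)**(-1/m).
    word_probs holds the model's probability for each word given
    its history; computed via log-sums to avoid underflow."""
    m = len(word_probs)
    log_prob = sum(math.log(p) for p in word_probs)
    return math.exp(-log_prob / m)  # == (product of probs) ** (-1/m)

# If the model assigns probability 0.1 to each of 5 words, the
# perplexity is 1/0.1 ≈ 10: a lower PPL indicates a better model.
print(ppl([0.1] * 5))
```

Normalizing by m makes perplexities comparable across test sets of different lengths, which is why it, rather than raw sequence probability, is the standard reported figure.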
“…One of the most widely used speech recognition difficulty measures at present is the so-called perplexity, which was introduced by Jelinek in 1977 [5]. Let W be a word sequence w1, w2, …, wn allowed by the grammar; perplexity is then defined as PP = P(w1, …, wn)…”
Section: Difficulty Measures For Speech Recognition
confidence: 99%