Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181)
DOI: 10.1109/icassp.1998.674396

Comparison of part-of-speech and automatically derived category-based language models for speech recognition

Cited by 43 publications (27 citation statements)
References: 10 publications

“…One approach to balance the specific, but poorer coverage word-based N-gram LMs with a more generic LM is to linearly interpolate the LM probabilities. This is commonly used with class-based LMs [17] and is used in this paper with paraphrastic LMs. Let P(w|h) denote the interpolated LM probability for any in-vocabulary word w following an arbitrary history h, this is given by…”
Section: Paraphrastic Language Models (mentioning)
confidence: 99%
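
The quoted statement stops before its formula, so the following is only a generic sketch of linear interpolation between two language models, not the citing paper's exact expression. The function name `interpolate`, the weight `lam`, and the toy probabilities are all assumptions made for illustration.

```python
def interpolate(p_word: float, p_class: float, lam: float) -> float:
    """Mix two LM probabilities for the same word and history.

    lam is a hypothetical interpolation weight in [0, 1]; in practice it
    would be tuned on held-out data.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_word + (1.0 - lam) * p_class


# Toy distributions over a three-word vocabulary for one fixed history.
# The word-based model is sharp but gives "emu" no probability mass;
# the class-based model is flatter but covers every word.
p_word_lm = {"cat": 0.70, "dog": 0.30, "emu": 0.00}
p_class_lm = {"cat": 0.40, "dog": 0.40, "emu": 0.20}

lam = 0.6  # assumed weight on the word-based model
mixed = {w: interpolate(p_word_lm[w], p_class_lm[w], lam) for w in p_word_lm}
```

Because both inputs are proper distributions and the weights sum to one, the mixture is again a proper distribution, and any word covered by either model receives non-zero probability.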
“…Data-driven approaches cluster words to minimize the overall perplexity of the corpus by a greedy approach [21,75]. It has been shown that data-driven approaches outperform classes based on POS tags [102].…”
Section: Class-based LMs (mentioning)
confidence: 99%
“…Experiments presented by Niesler et al (1998) show that a class-based approach utilizing POS tagging is not as efficient as application of categories based on stochastic properties of n-grams occurring in the corpus. In the case of the problem being considered here, we cannot apply the latter approach to OOC words because they are not present in the corpus.…”
Section: Combining the LM with a Flat Word List (mentioning)
confidence: 99%
“…As another option, the simpler tool Morfeusz described by Woliński (2006) can be applied to find grammatical categories of isolated words. The class-based model LM CB is then created using the standard method described by Niesler et al (1998). The standard word n-gram model LM W is also created using the same corpus.…”
Section: Combining the LM with a Flat Word List (mentioning)
confidence: 99%
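
To make the idea of a class-based model concrete, here is a minimal sketch of a common textbook class bigram, in which the probability of a word is the probability of its class given the previous class, multiplied by the probability of the word within its class. It assumes each word belongs to exactly one class and uses unsmoothed maximum-likelihood counts; the method actually described by Niesler et al. (1998) may differ. The function `train_class_bigram`, the toy corpus, and the word-to-class map are hypothetical.

```python
from collections import Counter


def train_class_bigram(tokens, word2class):
    """Return prob(word, prev_word) for an unsmoothed class bigram model."""
    classes = [word2class[w] for w in tokens]
    word_count = Counter(tokens)                        # count of each word
    class_count = Counter(classes)                      # count of each class
    class_bigram = Counter(zip(classes, classes[1:]))   # class-pair counts
    prev_count = Counter(classes[:-1])                  # class counts as history

    def prob(word, prev_word):
        c, c_prev = word2class[word], word2class[prev_word]
        if class_count[c] == 0 or prev_count[c_prev] == 0:
            return 0.0  # unseen class or history; no smoothing in this sketch
        p_membership = word_count[word] / class_count[c]
        p_transition = class_bigram[(c_prev, c)] / prev_count[c_prev]
        return p_membership * p_transition

    return prob


# Toy corpus and an assumed hand-written word-to-class map.
tokens = "the cat sees the dog".split()
word2class = {"the": "DET", "cat": "NOUN", "dog": "NOUN", "sees": "VERB"}
prob = train_class_bigram(tokens, word2class)
p_dog_after_the = prob("dog", "the")
```

In this toy corpus "the dog" and "the cat" share the DET-to-NOUN statistics, which is how a class-based model generalizes across words in the same category.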