Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
DOI: 10.1109/icslp.1996.607814

Compound words in large-vocabulary German speech recognition systems

Abstract: This paper analyzes the impact of German compound words on speech recognition. It is well known that, due to an idiosyncrasy of German orthography, compound words make up a major fraction of German vocabulary. Moreover, most out-of-vocabulary (OOV) compounds are composed of frequent words already in the lexicon. This paper introduces a new method for handling the components of compounds rather than the compounds themselves. This not only reduces the vocabulary, and therefore the perplexity, but also improves word acc…

Cited by 27 publications (17 citation statements)
References 2 publications
“…In Yang et al [1998], the elements of the lexicon can be any "segment patterns" extracted from the training corpus with the goal of minimizing the overall perplexity. The same perplexity-based metric is also used by Giachin [1995] and Berton et al [1996] to add and remove lexicon items. However, in practice, finding an optimal lexicon on the basis of perplexity estimates is very computationally expensive.…”
Section: Lexicon Construction From Corpus (mentioning)
confidence: 99%
“…They modeled the data-driven variants in both the dictionary and the language model. Berton et al (1996) introduced a method for handling the components of compounds rather than the compounds themselves. They proposed a method to decompose the OOV compound words to words already in the lexicon.…”
Section: Literature Review (mentioning)
confidence: 99%
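To make the idea of decomposing an OOV compound into words already in the lexicon more concrete, here is a minimal sketch in Python. It is not the procedure from Berton et al. (1996), whose details are not given here. The function name `split_compound`, the `min_len` parameter, and the toy lexicon are all assumptions made for illustration, and linking elements such as the German Fugen-s are deliberately ignored.

```python
from functools import lru_cache


def split_compound(word, lexicon, min_len=3):
    """Return one segmentation of `word` into lexicon entries, or None.

    Hypothetical helper: `lexicon` is assumed to be a set of lowercase
    word forms, and components shorter than `min_len` are not considered.
    """
    word = word.lower()

    @lru_cache(maxsize=None)
    def solve(start):
        # An empty remainder means everything before it was segmented.
        if start == len(word):
            return ()
        # Try the longest candidate component first.
        for end in range(len(word), start + min_len - 1, -1):
            part = word[start:end]
            if part in lexicon:
                rest = solve(end)
                if rest is not None:
                    return (part,) + rest
        return None

    result = solve(0)
    return list(result) if result is not None else None


# Toy lexicon (assumed for illustration only).
lexicon = {"haus", "tür", "schlüssel"}
print(split_compound("Haustürschlüssel", lexicon))
```

A word that is itself in the lexicon comes back as a single component, and a word that cannot be covered entirely by lexicon entries yields `None`. A real system would also need to handle linking elements and to rank competing segmentations.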
“…An example of this is Berton, Fetter, and Regel-Brietzmann (1996), who extended the word graphs output by a German speech recognizer with possible compounds by combining edges of words during a lexical search. The final hypotheses were then identified from the graph using dynamic programming techniques.…”
Section: Related Work (mentioning)
confidence: 99%
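The edge-combination step described in this statement can be pictured with a small Python sketch. It only illustrates the general idea of adding compound edges to a word graph and is not the authors' algorithm. The edge representation `(start_node, end_node, word)`, the function name `add_compound_edges`, and the set of known compounds are assumptions, and the subsequent dynamic-programming search over the graph is left out.

```python
from collections import defaultdict


def add_compound_edges(edges, compounds):
    """Return the edges plus new edges spanning two adjacent words.

    Hypothetical helper: each edge is (start_node, end_node, word), and
    `compounds` is an assumed set of compound spellings to look up.
    """
    # Index edges by their start node to find successors quickly.
    by_start = defaultdict(list)
    for start, end, word in edges:
        by_start[start].append((end, word))

    new_edges = []
    for start, mid, first in edges:
        for end, second in by_start[mid]:
            # German compounds are written as one word, so the second
            # component loses its capital letter.
            candidate = first + second.lower()
            if candidate in compounds:
                new_edges.append((start, end, candidate))
    return list(edges) + new_edges


# Toy word graph: "Haus" followed by either "Tür" or "für".
edges = [(0, 1, "Haus"), (1, 2, "Tür"), (1, 2, "für")]
extended = add_compound_edges(edges, {"Haustür"})
print(extended)
```

Only pairs of adjacent edges are combined here. Compounds with more than two components could be covered by applying the function repeatedly to its own output.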