2006
DOI: 10.1016/j.csl.2005.10.001

Morphology-based language modeling for conversational Arabic speech recognition

Abstract: Language modeling for large-vocabulary conversational Arabic speech recognition is faced with the problem of the complex morphology of Arabic, which increases the perplexity and out-of-vocabulary rate. This problem is compounded by the enormous dialectal variability and differences between spoken and written language. In this paper we investigate improvements in Arabic language modeling by developing various morphology-based language models. We present four different approaches to morphology-based language mod…

Cited by 78 publications (49 citation statements)
References 27 publications

“…Prior research on applications of morphological analyzers has focused on machine translation, language modeling and speech recognition (Habash, 2008; Chahuneau et al., 2013a; Kirchhoff et al., 2006). Morphological analysis enables us to link together multiple inflections of the same root, thereby alleviating word sparsity common in morphologically rich languages.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent research has demonstrated that adding information about word structure increases the quality of translation systems and alleviates sparsity in language modeling (Chahuneau et al., 2013b; Habash, 2008; Kirchhoff et al., 2006; Stallard et al., 2012).…”
Section: Introduction (mentioning)
confidence: 99%
“…The final score for each hypothesis can be computed as a log-linear combination of the invoked scores. The weights of this combination can be optimized to minimize the WER [8]. For the weight optimization, we use "Amoeba" search which is available in SRILM toolkit [14].…”
Section: Score Combination (mentioning)
confidence: 99%
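
The log-linear score combination described in the statement above can be illustrated with a minimal sketch. It assumes each hypothesis in an N-best list carries a list of log-domain scores (for example one acoustic score and one score per language model). The function names, the `scores`/`text` fields, and the weights are hypothetical and are not taken from the cited paper or from SRILM. The weights are fixed by hand here; in the cited setup they would be tuned on held-out data to minimize WER.

```python
from typing import Dict, List, Sequence


def combine_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Log-linear combination: a weighted sum of log-domain scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def rescore_nbest(nbest: List[Dict], weights: Sequence[float]) -> str:
    """Return the text of the hypothesis with the highest combined score."""
    best = max(nbest, key=lambda hyp: combine_scores(hyp["scores"], weights))
    return best["text"]


# Toy N-best list (invented values): [acoustic, word LM, morphology-based LM]
NBEST = [
    {"text": "hypothesis A", "scores": [-120.0, -30.0, -28.0]},
    {"text": "hypothesis B", "scores": [-118.0, -35.0, -36.0]},
]

if __name__ == "__main__":
    # Hand-picked weights; a tuning step would search over these values.
    print(rescore_nbest(NBEST, [1.0, 8.0, 4.0]))
```

With the language-model weights set to zero, the choice falls back to the acoustic score alone, which shows how the weights shift the decision between hypotheses.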
“…The features can be generated based on linguistic methods as in [6], or via data driven approaches as in [7]. Possible approaches for incorporating word features into LMs are: stream-based LMs [8], class-based LMs [9] and factored LMs [10]. In stream-based LMs, a normal back-off N-gram model is built over a stream of word classes, where the stream consists of sequences of a single class type called class stream.…”
Section: Introduction (mentioning)
confidence: 99%
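
The stream-based idea in the statement above (an N-gram model built over a stream of word classes rather than over the words themselves) can be sketched as follows. This is a simplified illustration: it uses unsmoothed maximum-likelihood bigrams instead of a real back-off model, and the word-to-class mapping, the class labels, and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def to_class_stream(words: List[str], word_to_class: Dict[str, str],
                    unknown: str = "<unk>") -> List[str]:
    """Replace each word by its class label to form a single class stream."""
    return [word_to_class.get(w, unknown) for w in words]


def bigram_counts(stream: List[str]) -> Tuple[Counter, Counter]:
    """Count histories and bigrams over the class stream with sentence markers."""
    padded = ["<s>"] + stream + ["</s>"]
    histories = Counter(padded[:-1])
    bigrams = Counter(zip(padded[:-1], padded[1:]))
    return histories, bigrams


def bigram_prob(prev: str, cur: str, histories: Counter, bigrams: Counter) -> float:
    """Maximum-likelihood estimate of P(cur | prev); no smoothing or back-off."""
    if histories[prev] == 0:
        return 0.0
    return bigrams[(prev, cur)] / histories[prev]


# Toy mapping (assumption): classes could equally be stems, roots, or patterns.
WORD_TO_CLASS = {"she": "PRON", "he": "PRON", "walks": "VERB", "runs": "VERB"}

if __name__ == "__main__":
    stream = to_class_stream(["she", "walks", "he", "runs"], WORD_TO_CLASS)
    histories, bigrams = bigram_counts(stream)
    print(stream, bigram_prob("PRON", "VERB", histories, bigrams))
```

Because several words share one class, the class stream has far fewer distinct N-grams than the word stream, which is the sparsity benefit the statement refers to.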
“…model (FLM). Such models were first proposed for speech recognition in Arabic languages [3], but they have also been adopted in statistical machine translation [4] and, more recently, in natural language generation [5].…”
Section: Introduction (mentioning)
confidence: 99%