IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002
DOI: 10.1109/icassp.2002.1005854

Language model adaptation through topic decomposition and MDI estimation

Abstract: This work presents a language model adaptation method combining the latent semantic analysis framework with the minimum discrimination information estimation criterion. In particular, an unsupervised topic model decomposition is built that makes it possible to infer topic-related word distributions from very short adaptation texts. The resulting word distribution is then used to constrain the estimation of a minimum divergence trigram language model. With respect to previous work, implementation details are discussed that ma…
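The adaptation step the abstract summarizes can be made concrete. Below is a minimal sketch, not the paper's implementation: the plain-dictionary LM tables and all function names are assumptions. With a unigram marginal constraint, the MDI solution reduces to rescaling each background probability P_B(w|h) by a word-dependent factor alpha(w) = P_adapt(w) / P_B(w) and renormalizing per history.

```python
def mdi_adapt(p_bg, p_adapt_unigram, p_bg_unigram):
    """Sketch of MDI adaptation with a unigram constraint (assumed data
    layout): p_bg maps each history h to a dict of P_B(w|h); the two
    unigram arguments map words to probabilities.

        P_A(w|h) = alpha(w) * P_B(w|h) / Z(h)
        alpha(w) = P_adapt(w) / P_B(w)
    """
    # Scaling factor: the topic-adapted unigram distribution (inferred
    # from the short adaptation text) against the background marginal.
    alpha = {w: p_adapt_unigram[w] / p_bg_unigram[w] for w in p_bg_unigram}
    adapted = {}
    for h, dist in p_bg.items():
        scaled = {w: alpha.get(w, 1.0) * p for w, p in dist.items()}
        z = sum(scaled.values())  # per-history normalization Z(h)
        adapted[h] = {w: p / z for w, p in scaled.items()}
    return adapted
```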

Cited by 12 publications (10 citation statements)
References 5 publications
“…In many works [5,7,10], α(w) is exponentially smoothed by a coefficient lower than 1, optimized on heldout data. However, in our experiments, we chose to use (10) as it is, since this paper does not seek to perfectly tune a LM adaptation but rather aims at better understanding mechanisms that are useful for topic adaptation.…”
Section: Minimum Discriminant Information Language Model Adaptation
confidence: 99%
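For concreteness, the smoothing this quote refers to amounts to raising the scaling factor to a power below 1; the function name and default beta below are illustrative, not values from the cited works (which tune the coefficient on held-out data).

```python
def smoothed_alpha(p_adapt, p_bg, beta=0.5):
    """Exponentially smoothed MDI scaling factor alpha(w) ** beta.

    beta < 1 damps the unigram ratio P_adapt(w) / P_B(w); beta = 1
    recovers the unsmoothed factor used as-is in the quoted experiments.
    The 0.5 default is a placeholder, not a tuned value.
    """
    return (p_adapt / p_bg) ** beta
```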
“…where n is an empirically set parameter. As it can be shown from (5) any topic-specific word reduces to 1, i.e., their probability is directly reported from the baseline LM except the normalization factor. Figure 1 presents word error rate (WER) and perplexity variations measured on our development set using either topic terminologies of different sizes or using the whole vocabulary.…”
Section: Feature Selection
confidence: 99%
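One plausible reading of this feature selection, sketched below with illustrative names: adaptation is restricted to a terminology of n topic words, and the scaling factor of every other word collapses to 1, so its probability comes straight from the baseline LM apart from the normalization factor.

```python
def selective_alpha(word, alpha, topic_terms):
    """Apply alpha(w) only inside the selected n-word topic terminology.

    Outside the terminology the factor is forced to 1, i.e. the baseline
    LM probability is kept up to normalization. Both `alpha` and
    `topic_terms` are assumed inputs, not structures from the paper.
    """
    return alpha.get(word, 1.0) if word in topic_terms else 1.0
```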
“…For example, in [3], it is used for efficiently calculating the relative entropy when n-gram parameter is pruned. In [4], it is used to calculate the normalization parameters for MDI estimation. In this paper, it is used for efficient LMLA probabilities generation.…”
Section: The Data Sparseness of N-gram Models
confidence: 99%
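The back-off trick these citations allude to can be sketched as follows; the LM interface (lm.vocab, lm.prob, lm.bow, lm.explicit) is hypothetical. Because a back-off model stores explicit probabilities only for observed n-grams, the MDI normalization Z(h) = sum_w alpha(w) P_B(w|h) can be computed recursively from the back-off history instead of summing over the whole vocabulary at every history.

```python
def backoff_normalizer(h, lm, alpha, z_cache):
    """Z(h) = sum_w alpha(w) * P_B(w|h) for a back-off LM (assumed
    interface: lm.vocab, lm.prob(w, h), lm.bow(h), lm.explicit(h)).

    Uses the identity
        Z(h) = sum_{w in E(h)} alpha(w) P(w|h)
             + bow(h) * (Z(h') - sum_{w in E(h)} alpha(w) P(w|h'))
    where E(h) holds the words with an explicit entry for history h and
    h' is the shortened back-off history.
    """
    if h in z_cache:
        return z_cache[h]
    if not h:  # unigram level: the single full pass over the vocabulary
        z = sum(alpha.get(w, 1.0) * lm.prob(w, ()) for w in lm.vocab)
    else:
        hp = h[1:]  # back-off history h'
        z_hp = backoff_normalizer(hp, lm, alpha, z_cache)
        seen = lm.explicit(h)
        z = (sum(alpha.get(w, 1.0) * lm.prob(w, h) for w in seen)
             + lm.bow(h) * (z_hp - sum(alpha.get(w, 1.0) * lm.prob(w, hp)
                                       for w in seen)))
    z_cache[h] = z
    return z
```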
“…One of the focuses of future work is integrating fast marginal adaptation directly into the decoder. An efficient implementation has been described in [20]. Also, we wish to replace current manual sentence and story segmentation with an automatic segmentation system.…”
Section: Discussion
confidence: 99%