2004
DOI: 10.1145/1034780.1034782

Lexical triggers and latent semantic analysis for cross-lingual language model adaptation

Abstract: In-domain texts for estimating statistical language models are not easily found for most languages of the world. We present two techniques to take advantage of in-domain text resources in other languages. First, we extend the notion of lexical triggers, which have been used monolingually for language model adaptation, to the cross-lingual problem, permitting the construction of sharper language models for a target-language document by drawing statistics from related documents in a resource-rich language. Next,…

Citations: cited by 22 publications (6 citation statements)
References: 10 publications
“…Cross-language information retrieval search results in languages differing from the query has also received attention (Rehder and et al 1998) as has LSA's use in language modeling (Kim and Khudanpur, 2004).…”
Section: Literature Review (mentioning)
confidence: 99%
“…Yet, they do not use a unified framework for both tasks. Some researchers cast uncertainty detection as a token sequence labeling problem, and then use hand-crafted rules to extract scopes [15,44,48,49]. Others define both tasks as token sequence labeling problems; yet use separate feature sets for each tasks or do not use the output of one task to inform the other [50].…”
Section: Related Work (mentioning)
confidence: 99%
“…L_T-words) that are topically related to s. For example, given that we sampled ('bacteria') from the L_S-document, we are very likely to sample words like ('bacteria') or ('disease') from the L_T-document. This is similar in spirit to the idea of lexical triggers [48]. We can use a baseline scoring function S_raw (as defined in Section 3.1) and define the trigger probability P_CC(t|s)…”
Section: Incorporating Information From An Auxiliary Language (mentioning)
confidence: 99%
“…Inter-lingual triggers have been also used in [29] to enrich resource deficient languages from those which are considered as potentially important. An inter-lingual trigger is henceforth a set made up of a word (or a sequence of words) f in a source language, and its best correlated words in a target language e_1, e_2, …”
Section: Inter-lingual Triggers (mentioning)
confidence: 99%