Evaluating a probabilistic model for cross-lingual information retrieval

Xu, Jinxi; Weischedel, Ralph; Nguyen, Chanh

doi:10.1145/383952.383968

Cited by 91 publications

(125 citation statements)

References 15 publications

Supporting

Mentioning

120

Contrasting


“…For example, the Arabic word "[Arabic word garbled in source]" can be translated as "bread" or "bake," and equation (1) would (with proper stemming) reward an occurrence of "baking bread." Corpus-based approaches to CLIR have generally developed within a framework based on language modeling rather than vector space models, at least in part because modern statistical translation frameworks offer a natural way of integrating translation and language models [18]. In general, language modeling approaches to retrieval rely on collection frequency (CF) in place of DF: 2 …”

Section: Introduction (mentioning)

confidence: 99%

Probabilistic Structured Query Methods

Darwish, Oard

2003


“…(Hiemstra and de Jong, 1999; Berger and Lafferty, 1999; Xu et al., 2001; Federico and Bertoldi, 2002). Our crosslingual work is based on a widely used extension of the above monolingual model (Xu et al., 2001). To generate a query from a document in a different language, say Arabic, one samples either the Arabic document or the English background model.…”

Section: Language Models (mentioning)

confidence: 99%
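
The generative process described in the statement above (each English query term is drawn either from the Arabic document, via translation, or from an English background model) can be sketched as a two-component mixture. The following is a minimal Python illustration under stated assumptions, not the cited authors' implementation: the function name, the mixing weight `alpha` and its default of 0.5, the `floor` guard, and the toy romanized tokens and probabilities are all hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, translation_probs,
                         background_probs, alpha=0.5, floor=1e-12):
    """Log-probability of generating an English query from a foreign-language document.

    Each query term e is scored as
        alpha * P(e | background) + (1 - alpha) * sum_a P(a | doc) * P(e | a)
    where a ranges over the document's own terms.
    alpha and floor are illustrative choices, not values from the cited work.
    """
    counts = Counter(doc_terms)
    doc_len = sum(counts.values())
    log_p = 0.0
    for e in query_terms:
        # Probability of reaching e by sampling a document term and translating it.
        translated = 0.0
        if doc_len:
            for a, c in counts.items():
                translated += (c / doc_len) * translation_probs.get(a, {}).get(e, 0.0)
        p = alpha * background_probs.get(e, 0.0) + (1 - alpha) * translated
        # Guard against log(0) for terms unseen in both components.
        log_p += math.log(max(p, floor))
    return log_p


# Hypothetical toy data: romanized document tokens and made-up probabilities.
translation_probs = {
    "khubz": {"bread": 0.6, "bake": 0.4},
    "maa": {"water": 1.0},
}
background_probs = {"bread": 0.01, "bake": 0.005, "water": 0.02}

bread_doc = ["khubz", "khubz", "maa"]
water_doc = ["maa", "maa", "maa"]

score = query_log_likelihood(["bread"], bread_doc, translation_probs, background_probs)
print(round(math.exp(score), 3))  # 0.205
```

Documents are then ranked by this score for a given query. In the toy data, `bread_doc` outranks `water_doc` for the query "bread" because only the former contains a term with a nonzero translation probability for it; the background component keeps the score of `water_doc` above zero.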

“…The cross-language relevance model is given in Equation (12), which is the same as Equation (10), above, but now subscripts indicate that the documents D_a and terms w_a are in Arabic, and the query terms e_i are English: (12) P(w_a | D_a) can still be estimated as in Equation (11), with the document and background models in Arabic. P(e_i | D_a) is estimated as in Berger and Lafferty (1999) and Xu et al. (2001):…”

Section: Query Expansion and Relevance Modeling (mentioning)

confidence: 99%

“…One other study has compared structured query translation with LM for cross-language retrieval (Xu et al., 2001). They found that language modeling performed better than structured query translation when used with a dictionary derived from a parallel corpus, or a combined dictionary derived from parallel and nonparallel sources.…”

Section: Query Expansion and Relevance Modeling (mentioning)

confidence: 99%

“…Lavrenko et al. (2002) compared relevance modeling with cross-lingual language modeling based on unexpanded queries, on the same Chinese data set used in the Xu et al. (2001) experiments, but given that relevance modeling is effectively a query expansion technique, we feel that it is fairer to compare it with another expansion technique.…”

Section: Query Expansion and Relevance Modeling (mentioning)

confidence: 99%


Structured queries, language modeling, and relevance modeling in cross-language information retrieval

Larkey

Connell

2005

Information Processing & Management


Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries: one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.


Statistical language modeling for information retrieval

Liu, Croft

2005

Annual Review Info Sci & Tec

