2013
DOI: 10.1007/978-3-642-36973-5_80

Domain Adaptation of Statistical Machine Translation Models with Monolingual Data for Cross Lingual Information Retrieval

Abstract: This work proposes to adapt an existing general SMT model for the task of translating queries that will subsequently be used to retrieve information from a target-language collection. In the scenario we focus on, access to the document collection itself is not available and changes to the IR model are not possible. We propose two ways to achieve the adaptation effect, both aimed at tuning parameter weights on a set of parallel queries. The first approach is via a standard tuning proc…

Cited by 17 publications (19 citation statements)
References 25 publications
“…One highly relevant study which explicitly deals with genre adaptation is by Nikoulina et al [38]. They adapt the Moses SMT toolkit to translate queries for cross-lingual IR applied on the CLEF Ad Hoc TEL 2009 test collection of bibliography entries.…”
Section: Genre Adaptation (mentioning)
confidence: 99%
“…This usually restricts the search space for oracle translations to the k-best list of derivations [10]. To alleviate this problem, we abstract away from the ranking problem and approximate retrieval quality of a translation q with its relevance score $S_{rel}(q, C_f^+)$ to the set of relevant documents C…”
Section: Oracle Query Translations (mentioning)
confidence: 99%
“…We address this problem by discriminative training techniques which are widely used in the SMT community, and use automatically constructed relevance judgments from linked data. We show that a decomposable proxy for retrieval quality in training alleviates the problem of a costly intermediate retrieval step in reranking frameworks [10], and allows us to make use of the full, and lexically more diverse, decoder search space to optimize query translations for the CLIR task.…”
Section: Introduction (mentioning)
confidence: 99%
“…Magdy et al [17] showed that preprocessing text consistently for MT and IR systems is beneficial. Nikoulina et al [19] built MT models tailored to query translation by tuning model weights with queries and reranking the top n translations to maximize effectiveness on a held-out query set. While improvements were more substantial using the latter method, another interesting finding was the low correlation between translation and retrieval quality.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…Despite the prevalence of context-independent word-based approaches for cross-language information retrieval (CLIR) derived from the IBM translation models [4], recent studies have shown that exploiting ideas from machine translation (MT) for context-sensitive query translation produces higher-quality results [17,19,24]. State-of-the-art MT systems take advantage of sophisticated models with "deeper" representations of translation units, e.g., phrase-based [13], syntax-based [25,27], and even semantics-based [11] models.…”
Section: Introduction (mentioning)
confidence: 99%