2021
DOI: 10.48550/arxiv.2112.07577
Preprint

GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval

Abstract: Dense retrieval approaches can overcome the lexical gap and lead to significantly improved search results. However, they require large amounts of training data which is not available for most domains. As shown in previous work (Thakur et al., 2021b), the performance of dense retrievers severely degrades under a domain shift. This limits the usage of dense retrieval approaches to only a few domains with large training datasets. In this paper, we propose the novel unsupervised domain adaptation method Generative …
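
The abstract above introduces Generative Pseudo Labeling (GPL), which labels synthetic query-passage pairs with a cross-encoder to train a dense retriever on the target domain. Below is a minimal sketch of that pseudo-labeling idea, assuming the sentence-transformers library and the public "cross-encoder/ms-marco-MiniLM-L-6-v2" checkpoint; the example texts are made up for illustration, and this is not the authors' released implementation.

```python
# Minimal sketch of cross-encoder pseudo labeling (illustration only; not the
# authors' released GPL code). Assumes the sentence-transformers library and the
# public "cross-encoder/ms-marco-MiniLM-L-6-v2" checkpoint.
from sentence_transformers import CrossEncoder

cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "how can dense retrievers be adapted to a new domain"
positive_passage = (
    "Domain adaptation methods generate synthetic queries for target-domain "
    "passages and use them as training data for the dense retriever."
)
negative_passage = (
    "Lexical retrieval methods such as BM25 match queries and documents by "
    "exact term overlap."
)

# Score both (query, passage) pairs with the cross-encoder; the score margin
# between the positive and a mined negative can serve as a soft pseudo label
# for training the dense retriever (e.g. with a MarginMSE-style objective).
pos_score, neg_score = cross_encoder.predict(
    [(query, positive_passage), (query, negative_passage)]
)
print(f"pseudo-label margin: {pos_score - neg_score:.3f}")
```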

Cited by 12 publications (18 citation statements)
References 20 publications
“…For example, the model is trained on Web search data and is transferred to medical domains. It has drawn considerable attention from the neural IR community [23,39,45,46]. To provide a deeper understanding of extrapolation evaluation, we investigate how benchmark extrapolation results correlate with domain transfer ability (RQ3).…”
Section: Relationship With Transfer Ability (mentioning, confidence: 99%)
“…Focusing on improving the transfer learning effectiveness of dense retrievers, Ma et al (2021) and Wang et al (2021) use supervised sequence-to-sequence models to augment the training data. They generate questions from texts from different collections and use these synthetic question-text pairs as positive training examples.…”
Section: Related Work (mentioning, confidence: 99%)
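As a concrete illustration of the query-generation-based augmentation described in the quote above, here is a minimal sketch assuming the Hugging Face transformers library and a public doc2query T5 checkpoint (the checkpoint name is an assumption for illustration, not the cited papers' exact setup); each sampled query paired with its source passage yields a synthetic positive training example.

```python
# Minimal sketch of synthetic query generation for retriever training data.
# Assumes the Hugging Face transformers library and the public doc2query
# checkpoint "doc2query/msmarco-t5-base-v1" (checkpoint name is an assumption).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "doc2query/msmarco-t5-base-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

passage = (
    "Dense retrievers embed queries and documents into a shared vector space "
    "and rank documents by vector similarity instead of exact term matching."
)

inputs = tokenizer(passage, return_tensors="pt", truncation=True, max_length=384)
# Sample a few queries per passage; each (synthetic query, passage) pair can be
# used as a positive training example in the target domain.
outputs = model.generate(
    **inputs,
    max_length=64,
    do_sample=True,
    top_k=25,
    num_return_sequences=3,
)
for ids in outputs:
    print(tokenizer.decode(ids, skip_special_tokens=True))
```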
“…For a fair comparison, Table 6 only includes models that use the same baseline training strategy as ours. Thus, we exclude approaches that depend on other models for expansion [25,33,51], costly training techniques such as knowledge distillation [9,17,18,38,41,44], or special pretraining [11,20,34] (see Table 8 for more comparisons).…”
Section: Evaluation Of Single Model Fusion (mentioning, confidence: 99%)
“…Even considering a general method such as knowledge distillation (KD), there are many different strategies. For example, we use a lightweight ColBERT teacher, while GPL [44], ColBERTv2 [41], and SPLADEv2 [9] use a more expensive cross-encoder teacher. First, a comparison with Contriever [20], GPL [44], and GTR [34], in columns (e)-(i): We observe that the generalization capability of different retrievers can be improved by increasing the model size.…”
Section: Evaluation Of Single Model Fusion (mentioning, confidence: 99%)
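For the cross-encoder-teacher distillation mentioned in the quote above, a common formulation is a MarginMSE-style objective. The following is a minimal sketch under that assumption, with toy tensors standing in for real encoder outputs and teacher scores; it is not the cited systems' training code.

```python
# Minimal sketch of cross-encoder-teacher distillation for a bi-encoder student.
# Tensor shapes and the MarginMSE loss form are illustrative assumptions.
import torch
import torch.nn.functional as F

def margin_mse_loss(q_emb, pos_emb, neg_emb, teacher_pos, teacher_neg):
    """Regress the student's score margin (dot-product scores of positive vs.
    negative passage) onto the teacher's score margin."""
    student_margin = (q_emb * pos_emb).sum(-1) - (q_emb * neg_emb).sum(-1)
    teacher_margin = teacher_pos - teacher_neg
    return F.mse_loss(student_margin, teacher_margin)

# Toy tensors standing in for encoder outputs and cross-encoder teacher scores.
q, pos, neg = (torch.randn(4, 768) for _ in range(3))
t_pos, t_neg = torch.randn(4), torch.randn(4)
print(margin_mse_loss(q, pos, neg, t_pos, t_neg))
```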