Proceedings of the Web Conference 2020
DOI: 10.1145/3366423.3380131

Selective Weak Supervision for Neural Information Retrieval

Abstract: This paper democratizes neural information retrieval to scenarios where large-scale relevance training signals are not available. We revisit the classic IR intuition that anchor-document relations approximate query-document relevance and propose a reinforcement weak supervision selection method, ReInfoSelect, which learns to select anchor-document pairs that best weakly supervise the neural ranker (action), using the ranking performance on a handful of relevance labels as the reward. Iteratively, for a batch o…
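For intuition, here is a minimal sketch of the kind of REINFORCE-style selection loop the abstract describes: a policy scores candidate anchor-document pairs (the action), the neural ranker is trained only on the kept pairs, and the change in NDCG on a small labeled set serves as the reward. The pairwise hinge loss, the feature representation, and the ndcg_on_dev helper are assumptions of this sketch, not the paper's exact formulation.

```python
# Hedged sketch of a ReInfoSelect-style selection loop (illustrative only).
import torch
import torch.nn as nn

class SelectionPolicy(nn.Module):
    """Scores each (anchor text, document) pair; a Bernoulli draw on the score
    decides whether the pair is kept as weak supervision for this batch."""
    def __init__(self, feature_dim: int):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(feature_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, pair_features: torch.Tensor) -> torch.Tensor:  # (batch, feature_dim)
        return torch.sigmoid(self.scorer(pair_features)).squeeze(-1)  # keep probabilities

def reinfoselect_step(policy, ranker, pair_features, pos_inputs, neg_inputs,
                      dev_batch, policy_opt, ranker_opt, ndcg_on_dev):
    # 1. Action: the policy decides which weak anchor-document pairs to keep.
    keep_prob = policy(pair_features)
    keep = torch.bernoulli(keep_prob).detach()
    log_prob = (keep * torch.log(keep_prob + 1e-8)
                + (1 - keep) * torch.log(1 - keep_prob + 1e-8)).sum()

    # 2. Train the neural ranker on the selected pairs only (pairwise hinge loss).
    before = ndcg_on_dev(ranker, dev_batch)  # baseline for the reward
    margin = 1.0 - ranker(pos_inputs) + ranker(neg_inputs)
    rank_loss = (keep * torch.clamp(margin, min=0.0)).sum() / (keep.sum() + 1e-8)
    ranker_opt.zero_grad()
    rank_loss.backward()
    ranker_opt.step()

    # 3. Reward: NDCG gain on a handful of labeled queries; REINFORCE update.
    reward = ndcg_on_dev(ranker, dev_batch) - before
    policy_loss = -reward * log_prob
    policy_opt.zero_grad()
    policy_loss.backward()
    policy_opt.step()
    return reward
```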

Cited by 39 publications (44 citation statements). References 39 publications.
“…Rücklé et al. (2019b) use weakly supervised training, self-supervised training methods, and question generation. Similar approaches were also explored in ad-hoc retrieval (Zhang et al., 2020; Ma et al., 2020; MacAvaney et al., 2019). A crucial limitation of these approaches is that they result in entirely separate models for each dataset and are thus not re-usable.…”
Section: Related Work
confidence: 99%
“…In our experiments with the TREC Deep Learning Track and three more few-shot document ranking benchmarks (Zhang et al., 2020), QDS-Transformer consistently improves over the standard retrofitting BERT ranking baselines (e.g., max-pooling on paragraphs) by 5% NDCG. It also shows gains over more recent transformer architectures that induce various sparse structures, including Sparse Transformer, Longformer, and Transformer-XH, as they were not designed to incorporate the essential information required in document ranking.…”
Section: Introduction
confidence: 80%
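The "max-pooling on paragraphs" baseline mentioned in this excerpt can be sketched as follows, assuming score_passage is any BERT-style cross-encoder that returns a relevance score for a (query, passage) pair; the passage length and splitting rule here are illustrative, not the cited papers' exact setup.

```python
# Hedged sketch of a MaxP-style baseline: score fixed-size passages of a long
# document with a cross-encoder and take the maximum as the document score.
from typing import Callable, List

def max_p_score(query: str, document: str,
                score_passage: Callable[[str, str], float],
                passage_words: int = 150) -> float:
    words: List[str] = document.split()
    passages = [" ".join(words[i:i + passage_words])
                for i in range(0, len(words), passage_words)] or [""]
    return max(score_passage(query, p) for p in passages)
```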
“…Few-shot Document Ranking. All experimental settings for few-shot learning are consistent with the "MS MARCO Human Labels" setting in previous studies (Zhang et al., 2020). Each method first trains a neural ranker on MARCO training labels, which are identical to those in the TREC DL track.…”
Section: Discussion
confidence: 99%
“…[8] generate pseudo-qrels from a news collection, using the titles as pseudo-queries and their content as relevant text. Other authors [2, 18] use the signal produced by anchor-document relationships to simulate qrels.…”
Section: Related Work
confidence: 99%
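The pseudo-qrel construction described in this excerpt can be illustrated with a short sketch: anchor text (or a news title) acts as the pseudo-query and the linked (or underlying) document as its relevant text. The AnchorPair structure and the length filter are illustrative assumptions, not the cited authors' exact procedure.

```python
# Illustrative sketch of pseudo-qrel generation from anchor-document pairs.
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

@dataclass
class AnchorPair:
    anchor_text: str   # hyperlink (or title) text, used as the pseudo-query
    target_doc: str    # content of the linked (or underlying) document

def pseudo_qrels(pairs: Iterable[AnchorPair],
                 min_words: int = 2, max_words: int = 10) -> Iterator[Tuple[str, str, int]]:
    """Yield (pseudo-query, document, relevance=1) triples, keeping only
    anchors that look like plausible queries."""
    for p in pairs:
        n = len(p.anchor_text.split())
        if min_words <= n <= max_words:
            yield (p.anchor_text, p.target_doc, 1)
```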