Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021
DOI: 10.1145/3404835.3463076

Improving Bi-encoder Document Ranking Models with Two Rankers and Multi-teacher Distillation

Abstract: BERT-based Neural Ranking Models (NRMs) can be classified according to how the query and document are encoded through BERT's self-attention layers: bi-encoder versus cross-encoder. Bi-encoder models are highly efficient because all the documents can be pre-processed before query time, but their performance is inferior to that of cross-encoder models. Both models utilize a ranker that receives BERT representations as input and generates a relevance score as output. In this work, we propose a method …
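As a rough illustration of the bi-encoder/cross-encoder distinction the abstract draws, the sketch below scores a query–document pair both ways with a generic BERT encoder. The model name, the [CLS] pooling, and the dot-product/linear rankers are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of the two scoring patterns contrasted in the abstract.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def bi_encoder_score(query: str, document: str) -> float:
    # Query and document are encoded independently, so document vectors
    # can be pre-computed offline; only the query is encoded at query time.
    q = tokenizer(query, return_tensors="pt", truncation=True)
    d = tokenizer(document, return_tensors="pt", truncation=True)
    with torch.no_grad():
        q_vec = encoder(**q).last_hidden_state[:, 0]   # [CLS] representation
        d_vec = encoder(**d).last_hidden_state[:, 0]
    return torch.matmul(q_vec, d_vec.T).item()          # simple dot-product ranker

def cross_encoder_score(query: str, document: str) -> float:
    # Query and document attend to each other in every self-attention layer,
    # which is more accurate but must be run per (query, document) pair online.
    pair = tokenizer(query, document, return_tensors="pt", truncation=True)
    with torch.no_grad():
        cls = encoder(**pair).last_hidden_state[:, 0]
    ranker = torch.nn.Linear(cls.size(-1), 1)            # untrained head, for shape only
    return ranker(cls).item()
```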

Cited by 16 publications (10 citation statements); References 10 publications
“…While BERT and CONV-KNRM also perform decently well, ColBERT doesn't perform as well as the other models. Our numbers for ColBERT are consistent with another recent study [5] that also trained ColBERT on Robust04 and ClueWeb09-Cat-B.…”
Section: Relevance and Inference Efficiency (supporting)
confidence: 89%
“…Following previous work [34], for Robust04 and ClueWeb09-Cat-B, we re-rank the 150 documents per query retrieved by the Indri initial ranking in the Lemur system. For the MS MARCO Dev set, we follow the common practice of re-ranking the top 1000 passages per query retrieved by BM25.…”
Section: Relevance and Inference Efficiency (mentioning)
confidence: 99%
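The setup quoted above is the standard two-stage pipeline: a cheap lexical retriever (Indri or BM25) produces a fixed-size candidate list, and the neural ranking model re-scores only that list. A minimal sketch of the pattern, assuming the rank_bm25 package and a placeholder neural_score callable — both are illustrative choices, not the cited papers' exact tooling:

```python
# Sketch of two-stage re-ranking: BM25 retrieves a candidate list (e.g. top
# 1000 passages per query), and a neural model re-scores only those candidates.
# `neural_score` is a hypothetical stand-in for any neural ranking model.
from rank_bm25 import BM25Okapi

def rerank(query: str, corpus: list[str], neural_score, k: int = 1000) -> list[str]:
    bm25 = BM25Okapi([doc.split() for doc in corpus])
    scores = bm25.get_scores(query.split())
    # First stage: keep only the top-k BM25 candidates.
    candidates = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:k]
    # Second stage: the (expensive) neural ranker orders just those candidates.
    return sorted((corpus[i] for i in candidates),
                  key=lambda doc: neural_score(query, doc), reverse=True)
```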
“…Another popular choice is ColBERT (Khattab and Zaharia, 2020), whose structure is more similar to dual-encoders and thus allows KD on in-batch negative examples. Besides, a handful of studies also try to improve the performance with multi-teacher distillation (Choi et al., 2021; Hofstätter et al., 2021). However, none of them investigate how to more effectively distill the knowledge of teachers into a student with a different architecture.…”
Section: Knowledge Distillation For Retrievers (mentioning)
confidence: 99%
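The multi-teacher distillation mentioned here generally trains a student ranker against soft targets aggregated from several teachers. Below is a minimal sketch under the assumption of mean-aggregated teacher scores and an MSE objective; it illustrates the general idea only, not the specific losses of Choi et al. (2021) or Hofstätter et al. (2021).

```python
# Illustrative multi-teacher distillation objective: the student's relevance
# scores are regressed toward the average of several teachers' scores.
# Mean aggregation and the MSE loss are assumptions made for illustration.
import torch

def multi_teacher_distill_loss(student_scores: torch.Tensor,
                               teacher_scores: list[torch.Tensor]) -> torch.Tensor:
    # teacher_scores: one tensor of relevance scores per teacher, each with the
    # same shape as student_scores (e.g. [batch_size]).
    target = torch.stack(teacher_scores, dim=0).mean(dim=0)
    return torch.nn.functional.mse_loss(student_scores, target.detach())
```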
“…Cross-encoder. Researchers have developed passage re-ranking models (i.e., re-rankers) to further improve end-to-end QA after the retrieval of candidate passages (Choi et al., 2021; Ren et al., 2021b). Using a cross-encoder as a re-ranker usually achieves superior performance.…”
Section: Preliminaries (mentioning)
confidence: 99%