Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.519

End-to-End Training of Neural Retrievers for Open-Domain Question Answering

Abstract: Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach …
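The pre-training recipe named in the abstract combines the Inverse Cloze Task (ICT) with masked salient span training. As a rough illustration of the ICT data construction only, the sketch below picks one sentence of a passage as a pseudo-query and uses the remaining sentences as its positive context; the keep-query probability and the function name are assumptions for illustration, not the authors' implementation.

```python
import random


def make_ict_example(passage_sentences, keep_query_prob=0.1):
    """Build one Inverse Cloze Task pair: a pseudo-query sentence and the
    surrounding passage as its positive context."""
    idx = random.randrange(len(passage_sentences))
    query = passage_sentences[idx]
    # Occasionally keep the query sentence inside the context so the retriever
    # does not learn to rely only on exact lexical overlap.
    if random.random() < keep_query_prob:
        context_sentences = passage_sentences
    else:
        context_sentences = passage_sentences[:idx] + passage_sentences[idx + 1:]
    return query, " ".join(context_sentences)
```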

Cited by 49 publications (57 citation statements) | References 14 publications

“…While this is a valid mechanism, perhaps conditioning on individual passages like we do is more precise for relevance supervision. Indeed, recent work (Sachan et al., 2021) illustrates this by using Fusion-in-Decoder during inference but forgoing the decoder's attention weights and using an equivalent version of the MARGINALIZEDLOSS for training the retriever. Furthermore, Fusion-in-Decoder is uniquely useful for QA-style tasks, where it has to select the correct answer from many passages.…”
Section: Discussion
confidence: 99%
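For readers unfamiliar with the marginalized loss referenced in this statement, the following is a minimal sketch of one common formulation: the retriever's distribution over the K retrieved passages is multiplied by per-passage reader likelihoods and summed before taking the log. Tensor shapes and names are illustrative assumptions, not the cited implementation.

```python
import torch
import torch.nn.functional as F


def marginalized_loss(retriever_scores, reader_log_likelihoods):
    """retriever_scores: (K,) unnormalized retriever scores s(q, z_k).
    reader_log_likelihoods: (K,) values of log p(answer | q, z_k).
    Returns -log sum_k p(z_k | q) * p(answer | q, z_k)."""
    log_p_z = F.log_softmax(retriever_scores, dim=-1)  # log p(z_k | q)
    return -torch.logsumexp(log_p_z + reader_log_likelihoods, dim=-1)


# Usage sketch: gradients reach the retriever through log_p_z.
scores = torch.randn(8, requires_grad=True)
reader_ll = torch.randn(8)  # stand-in for per-passage answer log-likelihoods
marginalized_loss(scores, reader_ll).backward()
```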
“…Pretraining for dense retrieval has recently gained considerable attention, following the success of self-supervised models in many NLP tasks (Liu et al., 2019; Brown et al., 2020). While most works focus on fine-tuning such retrievers on large datasets after pretraining (Guu et al., 2020; Sachan et al., 2021; Gao and Callan, 2021a), we attempt to bridge the gap between unsupervised dense models and strong sparse (e.g., BM25; Robertson and Zaragoza, 2009) or supervised dense baselines (e.g., DPR; Karpukhin et al., 2020). A concurrent work by Oguz et al. (2021) presented DPR-PAQ, which shows strong results on NQ after pretraining.…”
Section: Related Work
confidence: 98%
“…This is done mainly to avoid uninformative recurring words, e.g., verbs or adjectives. Note that as opposed to other approaches for span filtering (Glass et al., 2020; Guu et al., 2020; Sachan et al., 2021), our heuristics do not require any model.…”
Section: Pretraining: Recurring Span Retrieval
confidence: 99%
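As a concrete, purely illustrative example of a model-free span filter in the spirit of this statement, the sketch below keeps a recurring span only if it is long enough and contains at least one non-stopword token; the length threshold and stopword list are assumptions, not the cited paper's actual heuristics.

```python
# A tiny stopword list for illustration; a real list would be much larger.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "was", "it", "on"}


def keep_recurring_span(span, min_chars=5):
    """Model-free filter: keep a recurring span only if it is long enough and
    contains at least one token that is not a stopword."""
    tokens = span.lower().split()
    if len(span) < min_chars or not tokens:
        return False
    return any(tok not in STOPWORDS for tok in tokens)


# Example: a span of function words is dropped, an entity-like span is kept.
assert not keep_recurring_span("was on")
assert keep_recurring_span("Theory of Relativity")
```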
“…Most of the OpenQA models also consist of a retriever and a reasoner. The retriever is devised as a sparse term-based method such as BM25 (Robertson and Zaragoza, 2009) or a trainable dense passage retrieval method (Karpukhin et al., 2020; Sachan et al., 2021a), and the reasoner deals with each document individually (Guu et al., 2020) or fuses all the documents together (Izacard and Grave, 2021). Different from the documents in OpenQA, the subgraphs in KBQA can only be obtained by multi-hop retrieval, and the reasoner should deal with the entire subgraph instead of each individual relation to find the answer.…”
Section: Related Work
confidence: 99%
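To make the contrast in this statement concrete, here is a schematic sketch of the two reader styles it mentions: scoring each retrieved passage independently versus fusing all encoded passages before decoding, as in Fusion-in-Decoder. The encode/score/decode callables are assumed placeholders, not the APIs of the cited systems.

```python
import torch


def read_individually(encode, score, question, passages):
    """Score each (question, passage) pair on its own; the reader never sees
    the other passages."""
    return [score(encode(question, p)) for p in passages]


def read_fused(encode, decode, question, passages):
    """Fusion-in-Decoder style: encode each (question, passage) pair separately,
    concatenate the encoded token sequences, and let one decoder attend over
    all of them jointly when generating the answer."""
    encoded = [encode(question, p) for p in passages]  # each of shape (T, H)
    fused = torch.cat(encoded, dim=0)                  # shape (K * T, H)
    return decode(fused)
```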