Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1612
Latent Retrieval for Weakly Supervised Open Domain Question Answering

Abstract: Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a black-box information retrieval (IR) system to retrieve evidence candidates. We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from IR. We show for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs and without any IR system. In this setting, evidence retrieval from all…

Cited by 462 publications (294 citation statements)
References 30 publications
“…BM25-based methods remain the mainstream approach for document retrieval in industry. Previous work in open domain question answering has shown that BM25 is a difficult baseline to surpass when questions are written by workers who have prior knowledge of the answer (Lee et al., 2019a). We leave more comprehensive comparisons against other learning-based methods to future work, since the main goal of this demo paper is to present the system along with its dataset.…”
Section: Results for SOCO-QA Performance
confidence: 99%
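The BM25 baseline this excerpt refers to can be sketched with the standard Okapi scoring formula. The function name, tokenization, and parameter defaults below are illustrative assumptions for a minimal sketch, not part of the cited system:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`
    with Okapi BM25. `docs` is a list of token lists."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency for each distinct query term
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue  # term absent from the corpus contributes nothing
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores
```

Because the scoring is purely lexical, a document sharing no terms with the query receives a score of zero, which is one reason learned retrievers can outperform BM25 when question and evidence vocabularies diverge.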
“…has shown that using paragraphs as the unit of passage outperforms sentences or documents. Lee et al. (2019a) propose a trainable first-stage retriever that improves recall performance. Pipeline-based systems often suffer from error propagation (Zhao and Eskenazi, 2016).…”
Section: Related Work
confidence: 99%
“…Incorporation of knowledge into language models has shown promising results for downstream tasks, such as factually correct generation (Logan et al., 2019), commonsense knowledge graph construction (Bosselut et al., 2019), and entity typing (Zhang et al., 2019). More recently, several works have shown that learned mechanisms for explicit or implicit knowledge can lead to state-of-the-art results in question answering (Guu et al., 2020; Karpukhin et al., 2020; Lee et al., 2019; Lewis et al., 2020) and dialogue modeling (Roller et al., 2020).…”
Section: Related Work
confidence: 99%
“…A pseudo-query is a declarative sentence; it differs from the actual query, which is an interrogative sentence. ORQA, which uses learned ICT with pseudo-data to predict the context related to the query, performed better than the baseline model [43]. Pseudo-evidence consists of the sentences surrounding the pseudo-query, not the context that contains the information about the query.…”
Section: Evidence Extraction
confidence: 96%
“…Here, the unsupervised Inverse Cloze Task (ICT) proposed by the Open Retrieval Question Answering system (ORQA) [43] is used to confirm the relevance of the paragraph to the query. ICT is a task that finds the related context for a sentence; it is the inverse of the Cloze task [44].…”
Section: Evidence Extraction
confidence: 99%
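The ICT data construction these excerpts describe can be sketched as follows. The function name and sampling scheme are illustrative assumptions; note also that ORQA leaves the sentence in its context for a fraction of examples, whereas this minimal sketch always removes the pseudo-query from the pseudo-evidence:

```python
import random

def make_ict_example(passage_sentences, rng=None):
    """Build one Inverse Cloze Task training pair from a passage,
    given as a list of sentences: a randomly chosen sentence becomes
    the pseudo-query, and the remaining sentences become the
    pseudo-evidence context the retriever must match it to."""
    rng = rng or random.Random(0)
    i = rng.randrange(len(passage_sentences))
    pseudo_query = passage_sentences[i]
    pseudo_evidence = passage_sentences[:i] + passage_sentences[i + 1:]
    return pseudo_query, pseudo_evidence
```

Because the pairs are derived from unlabeled text alone, this pretraining needs no question-answer annotations, which is what makes the first-stage retriever trainable before any weak supervision is applied.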