Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1177
Harvesting Paragraph-level Question-Answer Pairs from Wikipedia

Abstract: We study the task of generating question-answer pairs from Wikipedia articles that cover content beyond a single sentence. We propose a neural network approach that incorporates coreference knowledge via a novel gating mechanism. Compared to models that only take into account sentence-level information (Heilman and Smith, 2010), we find that the linguistic knowledge introduced by the coreference representation aids question generation significantly, producing models that outperform the current state-of-the-art.

Cited by 148 publications (139 citation statements)
References 23 publications
“…Previous work on question generation has made extensive use of MT techniques. Du et al. (2017) use a Seq2Seq-based model to generate questions conditioned on context-answer pairs, and build on this work by preprocessing the context to resolve coreferences and adding a pointer network (Du and Cardie, 2018). Similarly, Zhou et al. (2018) use a part-of-speech tagger to augment the embedding vectors.…”
Section: Introduction
confidence: 99%
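The coreference preprocessing described above can be illustrated with a minimal sketch: before generating questions from a paragraph, pronouns are spliced out and replaced by their antecedent mentions so that the context is explicit. The chain representation (antecedent tokens plus the pronoun's token index) is a simplifying assumption for illustration, not any cited paper's exact data format.

```python
def resolve_coreferences(tokens, chains):
    """Replace pronoun tokens with their antecedent mentions.

    tokens: list of word tokens for the paragraph.
    chains: list of (antecedent_tokens, pronoun_index) pairs,
            a hypothetical simplified coreference output.
    """
    out = list(tokens)
    # Splice from right to left so earlier indices stay valid.
    for antecedent, idx in sorted(chains, key=lambda c: -c[1]):
        out[idx:idx + 1] = antecedent
    return out

tokens = "Marie Curie won the prize . She was Polish .".split()
chains = [(["Marie", "Curie"], 6)]  # "She" -> "Marie Curie"
print(resolve_coreferences(tokens, chains))
# → ['Marie', 'Curie', 'won', 'the', 'prize', '.', 'Marie', 'Curie', 'was', 'Polish', '.']
```

With the antecedent made explicit, a sentence-level QG model can produce a well-formed question ("Who was Polish?" → "Marie Curie") without needing paragraph-level context at generation time.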
“…While WS-TB is related to the approaches mentioned before, DQG is also related to question generation (QG). Most of the previous work in QG is in the context of reading comprehension (e.g., Du et al., 2017; Subramanian et al., 2018; Zhao et al., 2018; Du and Cardie, 2018) or QG for question answering (Duan et al., 2017). They substantially differ from our approach because they generate questions based on specific answer spans, while DQG generates a new title from a question's body that can be used as a question duplicate.…”
Section: Duplicates, Answers, Bodies
confidence: 96%
“…The problem is usually formulated as answer-aware question generation, where the position of the answer is provided as input. Most of them take advantage of the encoder-decoder framework with an attention mechanism [10,11,22,27,39,41,53]. Different approaches incorporate the answer information into the generation model by different strategies, such as an answer position indicator [27,53], separated answer encoding [23], embedding the relative distance between the context words and the answer [42], and so on.…”
Section: Related Work
confidence: 99%
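The "answer position indicator" strategy mentioned above is commonly realized as a BIO tag per token that is concatenated to the word embedding. The sketch below shows only the tagging step, under the assumption of a BIO scheme; the function name and span representation are hypothetical, not taken from any of the cited models.

```python
def answer_position_features(tokens, answer_start, answer_len):
    """Return one BIO tag per token marking the answer span.

    answer_start / answer_len: token index and length of the answer,
    a hypothetical input format for illustration.
    """
    tags = []
    for i, _ in enumerate(tokens):
        if i == answer_start:
            tags.append("B")  # beginning of the answer span
        elif answer_start < i < answer_start + answer_len:
            tags.append("I")  # inside the answer span
        else:
            tags.append("O")  # outside the answer
    return tags

tokens = "the eiffel tower is in paris".split()
print(answer_position_features(tokens, 5, 1))
# → ['O', 'O', 'O', 'O', 'O', 'B']
```

In an actual answer-aware QG model, each tag would be mapped to a learned embedding and concatenated with the token's word embedding before encoding.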
“…Automatically generating question-answer pairs from unlabeled text passages is of great value to many applications, such as assisting the training of machine reading comprehension systems [10,44,45], generating queries/questions from documents to improve search engines [17], training chatbots to get and keep a conversation going [40], generating exercises for educational purposes [7,18,19], and generating FAQs for web documents [25]. Many question-answering tasks, such as machine reading comprehension and chatbots, require a large amount of labeled samples for supervised training, and acquiring such samples is time-consuming and costly.…”
Section: Introduction
confidence: 99%