Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1082

RACE: Large-scale ReAding Comprehension Dataset From Examinations

Abstract: We present RACE, a new dataset for benchmark evaluation of methods in the reading comprehension task. Collected from English exams for Chinese middle and high school students aged 12 to 18, RACE consists of nearly 28,000 passages and nearly 100,000 questions generated by human experts (English instructors), and covers a variety of topics carefully designed to evaluate students' ability in understanding and reasoning. In particular, the proportion of questions that requires …
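RACE is organized as multiple-choice items: each question is tied to an exam passage and four candidate answers, and systems are scored by plain accuracy over the chosen options. Below is a minimal sketch of that structure and metric in Python; the field names (article, question, options, answer) are illustrative assumptions, not the official release schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RaceExample:
    """One multiple-choice reading-comprehension item (illustrative schema)."""
    article: str        # exam passage the question is about
    question: str       # question stem written by an English instructor
    options: List[str]  # four candidate answers, labelled A-D
    answer: str         # gold label, e.g. "B"

def accuracy(examples: List[RaceExample], predictions: List[str]) -> float:
    """Fraction of questions whose predicted label equals the gold label."""
    correct = sum(pred == ex.answer for ex, pred in zip(examples, predictions))
    return correct / len(examples)

# Toy usage with a single hand-written item.
ex = RaceExample(
    article="Tom stayed up all weekend reviewing his notes before the test...",
    question="Why did Tom review his notes all weekend?",
    options=["He was bored.", "He had a test.", "He lost his book.", "He likes rain."],
    answer="B",
)
print(accuracy([ex], ["B"]))  # 1.0
```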


Cited by 783 publications (736 citation statements)
References: 19 publications
“…In recent years, more and more large-scale RC datasets became available. These datasets focus on different types of RC tasks, such as cloze-style RC (Hermann et al., 2015; Hill et al., 2016), span-based RC with or without unanswerable questions (Rajpurkar et al., 2016, 2018) and multi-choice RC (Lai et al., 2017). Some tasks require the model to answer yes/no questions in addition to spans (Reddy et al., 2019).…”
Section: Related Work
confidence: 99%
“…We evaluate our method on the representative datasets SQuAD1.1 (Rajpurkar et al., 2016), SQuAD2.0 (Rajpurkar et al., 2018) and RACE (Lai et al., 2017). The passages in SQuAD1.1 are retrieved from Wikipedia articles and the questions are crafted by crowd-workers.…”
Section: Datasets
confidence: 99%
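The statement above contrasts span-based datasets (SQuAD1.1/2.0) with multiple-choice RACE. Span predictions are conventionally scored by exact match and token-overlap F1 against the gold answer string; the sketch below is a simplified version of those metrics (the official SQuAD evaluation script additionally strips punctuation and articles before comparing).

```python
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (simplified; no punctuation/article stripping)."""
    return " ".join(text.lower().split())

def exact_match(prediction: str, gold: str) -> bool:
    """Strict string equality after normalization."""
    return normalize(prediction) == normalize(gold)

def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between the predicted span and the gold span."""
    pred_toks = normalize(prediction).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Eiffel Tower", "The  Eiffel Tower"))              # True
print(round(token_f1("Eiffel Tower in Paris", "the Eiffel Tower"), 2))   # 0.57
```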
“…Recently, multiple datasets have been proposed for multi-hop QA, in which questions can only be answered when considering information from multiple sentences and/or documents (Khashabi et al., 2018a; Welbl et al., 2018; Mihaylov et al., 2018; Bauer et al., 2018; Dunn et al., 2017; Dhingra et al., 2017; Lai et al., 2017; Rajpurkar et al., 2018). The task of selecting justification sentences is complex for multi-hop QA, because of the additional knowledge aggregation requirement (examples of such questions and answers are shown in Figures 1 and 2).…”
Section: Introduction
confidence: 99%
“…In our dataset, in contrast, answering requires drawing inferences using knowledge not explicit in the text. Another recently published multiple choice dataset is RACE (Lai et al., 2017), which contains 100,000 questions on reading examination data. Rajpurkar et al. (2016) have proposed the Stanford Question Answering Dataset (SQuAD), a data set of 100,000 questions on Wikipedia articles collected via crowdsourcing.…”
Section: Related Work
confidence: 99%