Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1092

HEAD-QA: A Healthcare Dataset for Complex Reasoning

Abstract: We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning. The questions come from exams to access a specialized position in the Spanish healthcare system, and are challenging even for highly specialized humans. We then consider monolingual (Spanish) and cross-lingual (to English) experiments with information retrieval and neural techniques. We show that: (i) HEAD-QA challenges current methods, and (ii) the results lag well behind human performance, demonstrating …
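To make the multiple-choice format and the information-retrieval-style baselines concrete, here is a minimal sketch of a lexical-overlap baseline. It assumes the dataset is distributed on the Hugging Face Hub as `head_qa` with `es`/`en` configurations and per-question fields `qtext`, `answers` (each with `atext`/`aid`), and `ra` for the gold answer id; none of these identifiers is confirmed by this page.

```python
# Minimal sketch: pick the candidate answer with the largest lexical overlap
# with the question. Hub identifier, config names, and field names below are
# assumptions about the released dataset, not facts stated on this page.
from datasets import load_dataset


def overlap_score(question: str, candidate: str) -> int:
    """Count lowercase tokens shared by the question and a candidate answer."""
    return len(set(question.lower().split()) & set(candidate.lower().split()))


def predict(example: dict) -> int:
    """Return the answer id whose text overlaps most with the question."""
    best = max(example["answers"],
               key=lambda a: overlap_score(example["qtext"], a["atext"]))
    return best["aid"]


if __name__ == "__main__":
    test = load_dataset("head_qa", "en", split="test")  # assumed identifiers
    accuracy = sum(predict(ex) == ex["ra"] for ex in test) / len(test)
    print(f"lexical-overlap baseline accuracy: {accuracy:.3f}")
```

A stronger variant would score each candidate against retrieved supporting passages rather than the question text alone, closer in spirit to the information retrieval systems the abstract mentions.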

Cited by 26 publications (24 citation statements)
References 21 publications
“…These tasks are mostly in multiple-choice form. In Table 1, we list several representative subject-area multiple-choice QA datasets: NTCIR-11 QA-Lab (Shibuki et al., 2014), QS (Cheng et al., 2016), MCQA (Guo et al., 2017), ARC (Clark et al., 2018), GeoSQA (Huang et al., 2019b), HEAD-QA (Vilares and Gómez-Rodríguez, 2019), EXAMS (Hardalov et al., 2020), JEC-QA (Zhong et al., 2020), and MEDQA (Jin et al., 2020). Some multiple-choice MRC datasets for Chinese such as C³ are collected from language exams designed to test the reading comprehension ability of a human reader.…”
Section: Comparisons With Existing Subject-Area
confidence: 99%
“…Medical Question Answering: There have been efforts to create QA datasets for medical reasoning [21,22]. The Medication QA dataset [23] comprises nearly 700 drug-related consumer questions along with information retrieved from reliable websites and scientific papers.…”
Section: Related Work
confidence: 99%
“…[182] • Multiple Choice Question (MCQ): A few answer options are provided, of which one or more are correct [188,117].…”
Section: Classification By Answer Type
confidence: 99%
“…• Medical Examination: These datasets are constructed from questions asked in medical certification or other associated exams. They comprise clinical questions that require both knowledge and logical reasoning to answer [188].…”
Section: Classification By Sub-domain
confidence: 99%