Proceedings of the 20th Workshop on Biomedical Language Processing 2021
DOI: 10.18653/v1/2021.bionlp-1.7

emrKBQA: A Clinical Knowledge-Base Question Answering Dataset

Abstract: We present emrKBQA, a dataset for answering physician questions from a structured patient record. It consists of questions, logical forms and answers. The questions and logical forms are generated based on real-world physician questions and are slot-filled and answered from patients in the MIMIC-III KB (Johnson et al., 2016) through a semi-automated process. This community-shared release consists of over 940000 question, logical form and answer triplets with 389 types of questions and ≈7.5 paraphrases per ques…

Cited by 12 publications (9 citation statements)
References 15 publications
“…But as far as the authors' knowledge, so far there is no multi-modal clinical dataset that incorporates structured and unstructured EHR data for QA. QA in EHRs has been limited to QA over knowledge bases (Wang et al, 2021), EHR tables (Wang et al, 2020b; Raghavan et al, 2021) or clinical notes (Johnson et al, 2016b; Pampari et al, 2018). emrQA (Pampari et al, 2018) and CliniQG4QA (Yue et al, 2021) There are QA datasets that are generated using template-based method like MIMICSQL (Wang et al, 2020b) and emrKBQA (Raghavan et al, 2021) which utilize the structured EHR tables of MIMIC-III for QA.…”
Section: Related Work (mentioning)
confidence: 99%
“…QA in EHRs has been limited to QA over knowledge bases (Wang et al, 2021), EHR tables (Wang et al, 2020b; Raghavan et al, 2021) or clinical notes (Johnson et al, 2016b; Pampari et al, 2018). emrQA (Pampari et al, 2018) and CliniQG4QA (Yue et al, 2021) There are QA datasets that are generated using template-based method like MIMICSQL (Wang et al, 2020b) and emrKBQA (Raghavan et al, 2021) which utilize the structured EHR tables of MIMIC-III for QA. emrKBQA contains 940,000 questions, logical forms and answers which uses the structured records of MIMIC-III.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent works on EHR-QA with structured data (e.g., relational database or knowledge graph) have been focused on converting natural language questions (NLQ) into query languages such as SQL or SPARQL (Wang et al, 2020; Park et al, 2021; Bae et al, 2021) or into domain-specific forms (Raghavan et al, 2021). However, because all previous works mentioned above rely on specific query languages, the problem scope is limited to pre-defined data types (e.g., string, int, timestamp) and operations.…”
Section: Introduction (mentioning)
confidence: 99%
“…Applying methods in natural language processing to the EHR is a growing field with many potential applications in clinical decision support and augmented care. Corpus and annotation on EHR data are created to model semantic features and relation through linguistic cues, including relation extraction (Mowery et al, 2008), named entity recognition (Wang, 2009; Patel et al, 2018; Lybarger et al, 2021), question answering (Pampari et al, 2018; Raghavan et al, 2021), natural language inference (Romanov and Shivade, 2018), etc. However, few corpora have been built to model clinical thinking, especially about clinical diagnostic reasoning, a process involving clinical evidence acquisition, generating hypothesis, integration and abstraction over medical knowledge and synthesizing a conclusion in the form of a diagnosis and treatment plan (Bowen, 2006).…”
Section: Introduction (mentioning)
confidence: 99%