2020
DOI: 10.1093/jamiaopen/ooz072
Adapting and evaluating a deep learning language model for clinical why-question answering

Abstract: Objectives: To adapt and evaluate a deep learning language model for answering why-questions based on patient-specific clinical text. Materials and Methods: Bidirectional encoder representations from transformers (BERT) models were trained with varying data sources to perform SQuAD 2.0 style why-question answering (why-QA) on clinical notes. The evaluation focused on 1) comparing the merits of different training data and 2) error analysis. Results: The best model achieved an accuracy of 0.707 (or 0.760 by partial ma…

Cited by 25 publications (14 citation statements)
References 13 publications
“…Transformer-based models have been wildly successful in setting state-of-the-art benchmarks on a broad range of natural language processing (NLP) tasks, including question answering, document classification, machine translation, text summarization, and others [1][2][3]. These successes have been replicated in the clinical and biomedical domain by pre-training language models on large-scale clinical or biomedical corpora, then fine-tuning on a variety of clinical or biomedical downstream tasks, including computational phenotyping [4], automatic ICD coding [5], knowledge graph completion [6], and clinical question answering [7].…”
Section: Introduction
confidence: 99%
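
The pre-train-then-fine-tune recipe described in this excerpt is straightforward to reproduce with standard tooling. Below is a minimal sketch, assuming the Hugging Face transformers library; the clinical checkpoint name is one publicly available example (not necessarily the one used in the cited work), and the question/note pair is invented for illustration.

```python
# Minimal sketch: load a clinically pre-trained BERT and point it at
# SQuAD-style extractive QA. Assumes the Hugging Face `transformers` package.
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Illustrative public clinical checkpoint; any clinical/biomedical BERT works.
checkpoint = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# The span-prediction head is freshly initialized here and only becomes
# useful after fine-tuning on SQuAD-style (question, note, span) examples.
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

# SQuAD-style encoding: question first, clinical note as the context.
inputs = tokenizer(
    "Why was the patient started on warfarin?",
    "Warfarin was started for newly diagnosed atrial fibrillation.",
    return_tensors="pt",
    truncation=True,
)
outputs = model(**inputs)
# After fine-tuning, the argmax of these logits delimits the answer span.
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```
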
“…The token dictionary used in the test contained approximately 30,000 subtokens for both the SentencePiece and ByteLevelBPE methods. The test was carried out with learning rates of 1e-5 and 5e-5 for both languages, following the reference values from the BERT paper [17], [22], [26].…”
Section: Methods
confidence: 99%
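
The quoted setup, two subword vocabularies of roughly 30,000 tokens, can be rebuilt with standard tooling. A minimal sketch follows, assuming the sentencepiece and tokenizers Python packages; corpus.txt is a hypothetical plain-text training corpus, and the quoted learning rates (1e-5, 5e-5) would then be passed to the fine-tuning optimizer.

```python
# Minimal sketch: train the two ~30k-subtoken vocabularies mentioned above.
# Assumes the `sentencepiece` and `tokenizers` packages; `corpus.txt` is a
# hypothetical plain-text training file.
import sentencepiece as spm
from tokenizers import ByteLevelBPETokenizer

# SentencePiece model with a 30,000-piece vocabulary.
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="sp30k", vocab_size=30000
)

# Byte-level BPE with the same vocabulary budget.
bpe = ByteLevelBPETokenizer()
bpe.train(files=["corpus.txt"], vocab_size=30000)
bpe.save_model(".")  # writes vocab.json and merges.txt
```
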
“…In this study, we proposed RoBERTa, an artificial neural network based on transformers with 6 layers (base model). RoBERTa [22], [23] used the WordPiece tokenization technique in the pre-training stage, and the training method used masked language modelling (MLM) and NSP to support our QAS system. Our contributions are: 1) representing the extraction of language models at the character level for answer selection without any engineered features or linguistic tools; and 2) applying an efficient self-attention model to generate answers according to context by calculating the input and output representations regardless of word order.…”
confidence: 99%
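
For intuition about the masked-language-modelling objective named here, the sketch below instantiates a 6-layer RoBERTa-style model from scratch and runs one MLM step. It assumes the Hugging Face transformers library; the configuration, tokenizer, and sentence are illustrative, and (unlike the quoted setup) no NSP objective is shown.

```python
# Minimal sketch: one masked-language-modelling step on a 6-layer
# RoBERTa-style model. Assumes the Hugging Face `transformers` package;
# all names and sizes are illustrative.
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
config = RobertaConfig(num_hidden_layers=6, vocab_size=tokenizer.vocab_size)
model = RobertaForMaskedLM(config)  # randomly initialized pre-training target

inputs = tokenizer("The patient was given <mask> for pain.", return_tensors="pt")
# For brevity the loss here covers every position; a real MLM pipeline masks
# ~15% of tokens and sets the labels of unmasked positions to -100.
labels = inputs["input_ids"].clone()
outputs = model(**inputs, labels=labels)
print(float(outputs.loss))  # the cross-entropy that pre-training minimizes
```
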
“…In this study, we focus on the clinical reading comprehension task, which aims to extract a text span (a sentence or multiple sentences) as the answer from a patient clinical note given a question (Yue et al, 2020). Though many neural models (Seo et al, 2017; Rawat et al, 2020; Wen et al, 2020) have achieved impressive results on this task, their performance on new clinical contexts, whose data distributions may differ from those the models were trained on, is still far from satisfactory (Yue et al, 2020). One can improve performance by adding more QA pairs on new contexts into training; however, manually creating large-scale QA pairs in the clinical domain often involves tremendous expert effort and data privacy concerns.…”
Section: Related Work
confidence: 99%
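
To make the span-extraction task concrete, the sketch below runs question answering over an invented clinical note using an off-the-shelf SQuAD 2.0 checkpoint via the Hugging Face pipeline API. Both the checkpoint and the note are assumptions for illustration; applying an open-domain model to clinical text like this is precisely the distribution-shift setting the excerpt warns about.

```python
# Minimal sketch: extract an answer span from a clinical note given a
# question. Assumes the Hugging Face `transformers` pipeline API; the
# checkpoint is a public SQuAD 2.0 model and the note text is invented.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

note = (
    "Patient admitted with chest pain. Troponin elevated on arrival. "
    "Started on a heparin drip for suspected NSTEMI."
)
result = qa(question="Why was the patient started on heparin?", context=note)
print(result["answer"], result["score"])  # predicted span and its confidence
```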