2020
DOI: 10.2196/22508
Identification of Semantically Similar Sentences in Clinical Notes: Iterative Intermediate Training Using Multi-Task Learning

Abstract: Background: Although electronic health records (EHRs) have been widely adopted in health care, effective use of EHR data is often limited because of redundant information in clinical notes introduced by the use of templates and copy-paste during note generation. Thus, it is imperative to develop solutions that can condense information while retaining its value. A step in this direction is measuring the semantic similarity between clinical text snippets. To address this problem, we participated in th…
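To make the task concrete, here is a minimal sketch of scoring the semantic similarity of two clinical sentences with a general-purpose sentence encoder. This is an illustrative baseline only, not the paper's iterative multi-task model; the sentence-transformers package, the all-MiniLM-L6-v2 checkpoint, and the example sentences are assumptions, not taken from the abstract.

```python
# Illustrative STS baseline -- NOT the paper's multi-task model.
# The sentence-transformers package and the general-domain
# "all-MiniLM-L6-v2" checkpoint are assumptions for this sketch.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

pair = (
    "Patient denies chest pain or shortness of breath.",
    "No chest pain or dyspnea reported by the patient.",
)
embeddings = model.encode(pair, convert_to_tensor=True)

# Cosine similarity lies in [-1, 1]; ClinicalSTS gold labels are on a
# 0-5 scale, so a learned regression head or rescaling is needed in practice.
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"similarity: {score:.3f}")
```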

Cited by 23 publications (17 citation statements) | References 49 publications
“…Our best performing model, the mean_score ensemble, achieved a correlation of 0.87, reaching 6th place out of 33 teams in the n2c2 2019 Track 1 task. The best model on the task achieved a correlation of 0.9 [37]. Our results are presented in Table 1.…”
Section: Results
confidence: 99%
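The statement above describes a mean-score ensemble evaluated with Pearson correlation, the official n2c2 2019 Track 1 metric. A small sketch of that evaluation pattern follows; the prediction arrays are made-up stand-ins, not the cited systems' actual outputs.

```python
# Hypothetical sketch of a mean-score ensemble scored with Pearson's r.
# The gold labels and model predictions below are illustrative placeholders.
import numpy as np
from scipy.stats import pearsonr

gold = np.array([4.5, 0.5, 3.0, 2.0, 5.0])   # gold 0-5 similarity labels
model_preds = np.array([
    [4.2, 1.0, 3.1, 2.4, 4.8],               # model A predictions
    [4.6, 0.3, 2.7, 1.8, 4.9],               # model B predictions
    [4.4, 0.8, 3.3, 2.1, 5.0],               # model C predictions
])

ensemble = model_preds.mean(axis=0)           # the "mean_score" step
r, _ = pearsonr(ensemble, gold)
print(f"Pearson r = {r:.2f}")
```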
“…MedSTS was used as the gold standard in two clinical NLP open challenges including the 2018 BioCreative/Open Health NLP (OHNLP) challenge [65] and the 2019 n2c2/OHNLP ClinicalSTS shared task [43]. Similar to the general domain, pretrained transformer-based models using clinical text and biomedical literature, including ClinicalBERT and BioBERT [66], are state-of-the-art solutions. In this study, we used the dataset developed by the 2019 n2c2/OHNLP challenge on clinical semantic textual similarity [43].…”
Section: Methods
confidence: 99%
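The ClinicalBERT/BioBERT approach mentioned above amounts to sentence-pair regression on top of a pretrained encoder. A hedged sketch of one training step is below; the emilyalsentzer/Bio_ClinicalBERT checkpoint, the example pair, and the gold score are assumptions, and the cited works may differ in heads and hyperparameters.

```python
# Sketch of sentence-pair STS regression with a clinical BERT checkpoint.
# Checkpoint name, example pair, and label are assumptions for illustration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "emilyalsentzer/Bio_ClinicalBERT"  # a public ClinicalBERT release (assumption)
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(
    name, num_labels=1, problem_type="regression"
)

batch = tokenizer(
    ["Metformin 500 mg twice daily with meals."],
    ["Take metformin 500 mg two times a day with food."],
    padding=True, truncation=True, return_tensors="pt",
)
labels = torch.tensor([4.5])  # gold 0-5 similarity score (made up)

out = model(**batch, labels=labels)  # MSE loss via the regression head
out.loss.backward()                  # one illustrative training step
print(float(out.loss))
```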
“…MedSTS was used as the gold standard in two clinical NLP open challenges including the 2018 BioCreative/Open Health NLP (OHNLP) challenge [56] and the 2019 n2c2/OHNLP ClinicalSTS shared task [57]. Similar to the general domain, pretrained transformer-based models using clinical text and biomedical literature, including ClinicalBERT and BioBERT [58], are current solutions for STS. NLI is also known as recognizing textual entailment (RTE), a directional relation between text fragments (e.g., sentences) [59].…”
Section: Introduction
confidence: 99%
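The directionality of entailment noted above can be shown with a quick example: a specific premise entails a more general hypothesis, but not the reverse. The sketch below uses a general-domain MNLI model and made-up clinical sentences, both assumptions rather than anything from the cited work.

```python
# Illustrates that entailment is directional, using a general-domain
# MNLI model (an assumption -- the cited work is clinical, not MNLI).
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

premise = "The patient was started on 40 mg of atorvastatin."
hypothesis = "The patient is taking a statin."

# Premise -> hypothesis should be ENTAILMENT; the reverse direction
# should not be, since "a statin" does not pin down the drug or dose.
print(nli({"text": premise, "text_pair": hypothesis}))
print(nli({"text": hypothesis, "text_pair": premise}))
```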
“…Similarly to EE data sets, SS data sets are typically small, so the best approach appears to be pre-training models on SS data sets before fine-tuning on more generalised clinical data sets [56]. A Pearson correlation score of 0.83 was also achieved by fine-tuning ClinicalBERT using a combination of SS and clinical data sets [53].…”
Section: NLP Task Benchmarking for COVID-19 Literature Extraction
confidence: 99%
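The staged recipe described above (train on a larger related similarity corpus first, then fine-tune on the small clinical set) is the same "intermediate training" idea as in the indexed paper's title. A schematic sketch follows; the checkpoint, the toy sentence pairs, and the single-epoch loop are placeholders standing in for a full pipeline.

```python
# Schematic of intermediate training: fit a related STS corpus first,
# then continue on the small clinical set. All data here are toy stand-ins.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "emilyalsentzer/Bio_ClinicalBERT"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def make_loader(pairs, scores):
    """Tokenize (sentence_a, sentence_b) pairs with 0-5 gold scores."""
    enc = tokenizer([a for a, _ in pairs], [b for _, b in pairs],
                    padding=True, truncation=True, return_tensors="pt")
    enc["labels"] = torch.tensor(scores)
    return DataLoader([{k: v[i] for k, v in enc.items()}
                       for i in range(len(scores))], batch_size=2)

# Toy stand-ins for a larger general STS corpus and a small clinical one.
general = make_loader([("A man plays guitar.", "Someone plays an instrument.")], [4.0])
clinical = make_loader([("Denies fever.", "No fever reported.")], [5.0])

def train_one_epoch(loader):
    model.train()
    for batch in loader:
        loss = model(**batch).loss  # MSE loss from the regression head
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Stage 1: intermediate training on related data; Stage 2: clinical fine-tune.
for loader in (general, clinical):
    train_one_epoch(loader)
```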