EHR2CCAS: A framework for mapping EHR to disease knowledge presenting causal chain of disorders – chronic kidney disease example

Ma, Xiaojun; Imai, T.; Shinohara, Emiko; Kasai, Satoshi; Kato, Kosuke; Kagawa, Rina

doi:10.1016/j.jbi.2021.103692

Cited by 9 publications

(6 citation statements)

References 21 publications


“…In recent years, there has been a call for “deep phenotyping” methods that move beyond identification of binary phenotypes to characterization of more complex phenotypes such as timing or severity of a condition [24,44–46,115,156]. While our study indicates that existing literature remains focused on characterizing binary phenotypes, several preprints consider severity and temporal phenotyping [36,157,158].…”

Section: Discussion (mentioning)

confidence: 97%

Machine Learning Approaches for Electronic Health Records Phenotyping: A Methodical Review

Yang, Varghese, Stephenson et al. 2022

Preprint


Objective: Accurate and rapid methods for phenotyping are a prerequisite to realizing the potential of electronic health records (EHRs) data for clinical and translational research. This study reviews the literature on machine learning (ML) approaches for phenotyping with respect to the phenotypes considered, the data sources and methods used, and the contributions within the wider context of EHR-based research. Materials and Methods: We searched for relevant articles in PubMed and Web of Science published between January 1, 2018 and April 14, 2022. After screening, we collected data on 52 variables across 106 selected articles. Results: ML-based methods were developed for 156 unique phenotypes, primarily using EHR data from a single institution or health system. 72 of 106 articles leveraged unstructured data in clinical notes. In terms of methodology, supervised learning is the most prevalent ML paradigm (n = 64, 60.4%), with half of the articles employing deep learning. Semi-supervised and weakly-supervised approaches were applied to reduce the burden of obtaining gold-standard labeled data (n = 21, 19.8%), while unsupervised learning was used for phenotype discovery (n = 20, 18.9%). Federated learning has been applied to develop algorithms across multiple institutions while preserving data privacy (n = 2, 1.9%). Discussion: While the use of ML for phenotyping is growing, most articles applied traditional supervised ML to characterize the presence of common, chronic conditions. Conclusion: Continued research in ML-based methods is warranted, with particular attention to the development of advanced methods for complex phenotypes and standards for reporting and evaluating phenotyping algorithms.



“…Among Japanese NLP studies focused on medical issues, Imai et al [4] developed a system that performs extraction and P/N classification of malignant findings from radiological reports such as CT reports and MRI reports; Ma et al [5] built a system that performs extraction and P/N classification of abnormal findings from discharge summaries, progress notes, and nursery notes; and Aramaki et al [6] developed a system that performs extraction and P/N classification of disease names and symptoms from case history summaries. In addition, Mashima et al [7] extracted adverse events from progress notes about patients who received intravenous injections of cytotoxic anticancer drugs, and Usui et al [8] extracted symptomatic states from data stored in the electronic medication records of a community pharmacy and standardized them according to the codes of the International Classification of Diseases, Tenth Revision in order to create a dataset of patients' complaints.…”

Section: Related Study (mentioning)

confidence: 99%

“…The possibilities for the use of NER in healthcare are broad and varied, as shown by the various efforts undertaken in previous studies [4–10]. Because pharmaceutical care records contain a large amount of information on adverse drug effects, it should be possible to alert healthcare professionals when symptoms of possible adverse drug reactions are extracted with reference to the attached document information.…”

Section: Future Utilization (mentioning)

confidence: 99%

Using the Natural Language Processing System Medical Named Entity Recognition-Japanese to Analyze Pharmaceutical Care Records: Natural Language Processing Analysis

Ohno, Kato, Ishikawa et al. 2024

JMIR Form Res


Background: Large language models have propelled recent advances in artificial intelligence technology, facilitating the extraction of medical information from unstructured data such as medical records. Although named entity recognition (NER) is used to extract data from physicians’ records, it has yet to be widely applied to pharmaceutical care records. Objective: In this study, we aimed to investigate the feasibility of automatic extraction of the information regarding patients’ diseases and symptoms from pharmaceutical care records. The verification was performed using Medical Named Entity Recognition-Japanese (MedNER-J), a Japanese disease-extraction system designed for physicians’ records. Methods: MedNER-J was applied to subjective, objective, assessment, and plan data from the care records of 49 patients who received cefazolin sodium injection at Keio University Hospital between April 2018 and March 2019. The performance of MedNER-J was evaluated in terms of precision, recall, and F1-score. Results: The F1-scores of NER for subjective, objective, assessment, and plan data were 0.46, 0.70, 0.76, and 0.35, respectively. In NER and positive-negative classification, the F1-scores were 0.28, 0.39, 0.64, and 0.077, respectively. The F1-scores of NER for objective (0.70) and assessment data (0.76) were higher than those for subjective and plan data, which supported the superiority of NER performance for objective and assessment data. This might be because objective and assessment data contained many technical terms, similar to the training data for MedNER-J. Meanwhile, the F1-score of NER and positive-negative classification was high for assessment data alone (F1-score=0.64), which was attributed to the similarity of its description format and contents to those of the training data. Conclusions: MedNER-J successfully read pharmaceutical care records and showed the best performance for assessment data. However, challenges remain in analyzing records other than assessment data. Therefore, it will be necessary to reinforce the training data for subjective data in order to apply the system to pharmaceutical care records.


“…• detection of events [29,30], including self-harm events [31]; • extraction of diagnoses [13,32–35] and their codes [36–38]; • recognition of named entities [5,14,39–41], and more specifically of personal information [21,42,43] and family history [20]; • localization of advices [44] and arguments [45] in scientific literature; • extraction of relations [46–48], including temporal [49] and causality [50,51] relations.…”

Section: Information Extraction (mentioning)

confidence: 99%

Year 2021: COVID-19, Information Extraction and BERTization among the Hottest Topics in Medical Natural Language Processing

Grabar, Grouin 2022

Yearb Med Inform


Objectives: Analyze the content of publications within the medical natural language processing (NLP) domain in 2021. Methods: Automatic and manual preselection of publications to be reviewed, and selection of the best NLP papers of the year. Analysis of the important issues. Results: Four best papers have been selected in 2021. We also propose an analysis of the content of the NLP publications in 2021, all topics included. Conclusions: The main issues addressed in 2021 are related to the investigation of COVID-related questions and to the further adaptation and use of transformer models. Besides, the trends from the past years continue, such as information extraction and use of information from social networks.
