MetaMap Lite: an evaluation of a new Java implementation of MetaMap

Demner‐Fushman, Dina; Rogers, Willie J.; Aronson, Alan R.

doi:10.1093/jamia/ocw177

Cited by 140 publications

(97 citation statements)

References 15 publications

“…This process resulted in a weighted-average precision of 0.62, recall of 0.82, and F1 score of 0.71 for the CSU data, as compared to a previously reported weighted-average precision of 0.67, recall of 0.53, and F1 score of 0.58 for human clinical narratives [54]. The results of this evaluation can be seen in Table 3.…”

Section: Evaluation Of Metamap On Veterinary Records (mentioning)

confidence: 79%

“…Evaluation metric. For all models we trained (LSTM, DT, and RF), we used the same evaluation metrics previously reported for MetaMap Lite [54]: a) precision, defined as the proportion of documents which were assigned the correct category; b) recall, defined as the proportion of documents from a given category that were correctly identified; and c) F1 score, defined as the harmonic average of precision and recall. Formulas for these metrics are provided below: Precision =…”

Section: Discussion (mentioning)

confidence: 99%
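The metrics named in the two statements above can be sketched in a few lines. This is a minimal illustration that assumes the standard per-class definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean) and a support-weighted average; the citing paper's own formulas are truncated in the excerpt, so the function names and the toy labels below are assumptions for illustration, not the authors' code.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for single-label data."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1, actual)
    return metrics


def weighted_average(metrics):
    """Average precision, recall, and F1 across labels, weighted by support."""
    total = sum(m[3] for m in metrics.values())
    return tuple(
        sum(m[i] * m[3] for m in metrics.values()) / total for i in range(3)
    )


# Toy labels (hypothetical, not data from any cited study).
toy_true = ["a", "a", "a", "b"]
toy_pred = ["a", "a", "b", "b"]
toy_metrics = per_class_metrics(toy_true, toy_pred)
toy_weighted = weighted_average(toy_metrics)
```

Note that a weighted-average F1 computed this way averages the per-class F1 scores, so it need not equal the harmonic mean of the weighted-average precision and recall.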

Deep learning facilitates rapid classification of human and veterinary clinical narratives

Pineda, Walk, Venkataraman, et al. 2018. Preprint.

Objective: Currently, dedicated tagging staff spend considerable effort assigning clinical codes to patient summaries for public health purposes, and machine-learning automated tagging is bottlenecked by availability of electronic medical records. Veterinary medical records, a largely untapped data source that could benefit both human and non-human patients, could fill the gap. Materials and Methods: In this retrospective study, we trained long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human and 89,591 veterinary records. We established relevant baselines by training Decision Trees (DT) and Random Forests (RF) on the same data. We finally investigated the effect of merging data across clinical settings and probed model portability. Results: We show that the LSTM-RNNs accurately classify veterinary/human text narratives into top-level categories with an average weighted macro F1 score of 0.735/0.675 respectively. The evaluation metric for the LSTM was 7 and 8% higher than that of the DT and RF models respectively. We generally did not find evidence of model portability albeit moderate performance increases in select categories. Discussion: We see a strong positive correlation between number of training samples and classification performance, which is promising for future efforts. The use of LSTM-RNN models represents a scalable structure that could prove useful in cohort selection, which could in turn better address emerging public health concerns. Conclusion: Digitization of human and veterinary health information will continue to be a reality. Our approach is a step forward for these two domains to learn from, and inform, one another.

“…To provide additional information about the questions that could be used for diverse IR and NLP tasks, we automatically annotated the questions with the focus, its UMLS Concept Unique Identifier (CUI) and Semantic Type. We combined two methods to recognize named entities from the titles of the crawled articles and their associated UMLS CUIs: (i) exact string matching to the UMLS Metathesaurus, and (ii) MetaMap Lite [16]. We then used the UMLS Semantic Network to retrieve the associated semantic types and groups.…”

Section: Methods (mentioning)

confidence: 99%
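The first of the two methods in the statement above, exact string matching of title text against a term list, can be sketched as a longest-match dictionary lookup. Everything in this example is an assumption for illustration: the lexicon is a toy table with placeholder identifiers rather than real UMLS CUIs, and `exact_match` is a hypothetical helper, not part of MetaMap Lite or of the citing authors' pipeline. The second method (MetaMap Lite itself) is not shown.

```python
import re

# Hypothetical term-to-identifier table. The values are placeholders,
# NOT real UMLS Concept Unique Identifiers.
TOY_LEXICON = {
    "type 2 diabetes": "CUI_PLACEHOLDER_1",
    "diabetes": "CUI_PLACEHOLDER_2",
    "high blood pressure": "CUI_PLACEHOLDER_3",
}


def exact_match(title, lexicon, max_len=5):
    """Return (term, identifier) pairs found by longest-first exact matching."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first so that "type 2 diabetes"
        # wins over the shorter entry "diabetes".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                matches.append((candidate, lexicon[candidate]))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return matches


toy_matches = exact_match(
    "What causes Type 2 Diabetes and high blood pressure?", TOY_LEXICON
)
```

Longest-first matching is one simple design choice for resolving overlapping entries; a real pipeline would also need normalization of spelling variants, which exact matching alone does not provide.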

A question-entailment approach to question answering

2019

Self Cite

Background: One of the challenges in large-scale information retrieval (IR) is developing fine-grained and domain-specific methods to answer natural language questions. Despite the availability of numerous sources and datasets for answer retrieval, Question Answering (QA) remains a challenging problem due to the difficulty of the question understanding and answer extraction tasks. One of the promising tracks investigated in QA is mapping new questions to formerly answered questions that are “similar”. Results: We propose a novel QA approach based on Recognizing Question Entailment (RQE) and we describe the QA system and resources that we built and evaluated on real medical questions. First, we compare logistic regression and deep learning methods for RQE using different kinds of datasets including textual inference, question similarity, and entailment in both the open and clinical domains. Second, we combine IR models with the best RQE method to select entailed questions and rank the retrieved answers. To study the end-to-end QA approach, we built the MedQuAD collection of 47,457 question-answer pairs from trusted medical sources which we introduce and share in the scope of this paper. Following the evaluation process used in TREC 2017 LiveQA, we find that our approach exceeds the best results of the medical task with a 29.8% increase over the best official score. Conclusions: The evaluation results support the relevance of question entailment for QA and highlight the effectiveness of combining IR and RQE for future QA efforts. Our findings also show that relying on a restricted set of reliable answer sources can bring a substantial improvement in medical QA.

“…[12] A lightweight Java implementation (Metamap Lite) was used in our pipeline due to processing speed and ease of use. In a recent study, MetaMap Lite demonstrated real-time speed and extraction performance comparable to or exceeding those of MetaMap and other popular biomedical text processing tools, [13] clinical Text Analysis and Knowledge Extraction System (cTAKES), [14] and DNorm. [15] Metamap-Lite extracted medical problems, tests, and treatments from 2010 i2b2 concepts dataset with precision 47.0, recall 31.9, and F1 38.0.…”

Section: Umls-driven Natural Language Processing (Nlp) Derived Influe… (mentioning)

confidence: 99%

“…[15] Metamap-Lite extracted medical problems, tests, and treatments from 2010 i2b2 concepts dataset with precision 47.0, recall 31.9, and F1 38.0. [13] After identifying the UMLS concepts, the NLP pipeline assigned each extracted UMLS Metathesaurus concept an assertion value (present, absent, conditional, hypothetical, possible, not-patient) with an in-house statistical assertion classifier. While building the in-house assertion classifier, the Stanford NLP library [16] was used for tokenization, POS tagging, and dependency parsing to capture a wide range of syntactic and semantic features presented in clinical text.…”

Section: Umls-driven Natural Language Processing (Nlp) Derived Influe… (mentioning)

confidence: 99%
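As a quick consistency check on the figures quoted in the two statements above, and assuming F1 there is the usual harmonic mean of precision (P) and recall (R) expressed in percent, the reported values agree with one another:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 47.0 \times 31.9}{47.0 + 31.9}
    = \frac{2998.6}{78.9}
    \approx 38.0
```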

Leveraging UMLS-driven NLP to enhance identification of influenza predictors derived from electronic medical record data

Stephens, Yetisgen, et al. 2020. Preprint.

Objective: Multiple clinical prediction rules have been developed, but lack validation. This study aims to identify a set of prediction algorithms for influenza, based on electronic health record (EHR) structured data and clinical notes derived data using Unified Medical Language System (UMLS) driven natural language processing (NLP). Materials and Methods: Data were extracted from an enterprise-wide data warehouse for all patients who tested positive for influenza and were seen in ambulatory care between 2009 and 2019 (N = 7,278). A text processing pipeline was used to analyze chart notes for UMLS terms for symptoms of interest to improve data quality completeness. Three models, which step up complexity of the dataset and predictors, were tested with least absolute shrinkage and selection operator (LASSO)-selected parameters to identify predictors for influenza. Receiver operating characteristic (ROC) curves compared test accuracy across the three models. Results: Three models identified 7, 8, and 10 predictors, and the most complex model performed best. The addition of the UMLS-driven NLP symptoms data improved data quality (false negatives) and increased the number of significant predictors. NLP also increased the strength of the models, as did the addition of two-way predictor interactions. Discussion: The EHR is a feasible source for offering rapidly accessible datasets for influenza related prediction research that was used to produce a prediction model for influenza. Combining data collected in routine care with data science methods improved a prediction model for influenza, and in the future, could be used to drive diagnostics at the point of care.
