Phenotyping neurological patients involves converting signs and symptoms into machine-readable codes selected from an appropriate ontology, a process that is currently manual and laborious. MetaMap is used for high-throughput mapping of the medical literature to concepts in the Unified Medical Language System (UMLS) Metathesaurus. MetaMap was evaluated as a tool for the high-throughput phenotyping of neurological patients. Based on 15 patient histories from electronic health records, 30 patient histories from neurology textbooks, and 20 clinical summaries from the Online Mendelian Inheritance in Man repository, MetaMap showed a recall of 61-89%, a precision of 84-93%, and an accuracy of 56-84% for the identification of phenotype concepts. The most common cause of false negatives (failure to recognize a phenotype concept) was the inability of MetaMap to find concepts that were expressed as a description or a definition of the concept rather than by name. The most common cause of false positives (incorrect identification of a concept in the text) was a failure to recognize that a concept was negated. MetaMap shows potential for high-throughput phenotyping of neurological patients if these causes of false negatives and false positives can be addressed.
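The reported metrics can be illustrated with a minimal sketch. The abstract does not define accuracy; the function below assumes accuracy = TP / (TP + FP + FN), since concept extraction has no natural count of true negatives. The function name and the counts are hypothetical and are not taken from the study.

```python
def phenotype_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute recall, precision, and accuracy for concept extraction.

    tp: phenotype concepts correctly identified
    fp: concepts identified that are not present (e.g. negated findings)
    fn: phenotype concepts present in the text but missed
    """
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one reference and one predicted concept")
    return {
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        # Assumption: accuracy without true negatives, TP / (TP + FP + FN).
        "accuracy": tp / (tp + fp + fn),
    }


# Hypothetical counts for illustration only.
metrics = phenotype_metrics(tp=80, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under this assumed definition, accuracy is always lower than both recall and precision, which matches the pattern of the ranges reported above.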
Disease phenotypes are characterized by signs (what a physician observes during the examination of a patient) and symptoms (the complaints of a patient to a physician). Large repositories of disease phenotypes are accessible through the Online Mendelian Inheritance in Man, Human Phenotype Ontology, and Orphadata initiatives. Many of the diseases in these datasets are neurologic. In each repository, the phenotype of a neurologic disease is represented as a variable-length list of concepts selected from a restricted ontology. Visualizations of these concept lists are not provided. We address this limitation by using subsumption to reduce the number of descriptive features from 2,946 classes to thirty superclasses. Phenotype feature lists of variable length were converted into fixed-length vectors. Phenotype vectors were aggregated into matrices and visualized as heat maps that allowed side-by-side disease comparisons. Individual diseases (each representing a row in the matrix) were visualized as word clouds. We illustrate the utility of this approach by visualizing the neuro-phenotypes of 32 dystonic diseases from Orphadata. Subsumption can collapse phenotype features into superclasses, phenotype lists can be vectorized, and phenotype vectors can be visualized as heat maps and word clouds.
Disease phenotypes are characterized by signs (what a physician observes during the examination of a patient) and symptoms (the complaints of a patient to a physician). Large repositories of disease phenotypes are accessible through the Online Mendelian Inheritance in Man, Human Phenotype Ontology, and Orphadata initiatives. Many of the diseases in these datasets are neurologic. In each repository, the phenotype of a neurologic disease is represented as a variable-length list of concepts selected from a suitable ontology. Visualizations of these lists are not provided. We address this limitation by using subsumption to collapse the descriptive features from 2,946 classes into thirty superclasses. Phenotype feature lists of variable length were converted into fixed-length numerical vectors. Phenotype vectors can be aggregated into matrices and visualized as heat maps that allow side-by-side disease comparisons. Individual diseases (each representing a row in the matrix) can be visualized as word clouds. We illustrate the utility of this approach with a use case based on 32 dystonic diseases in Orphadata. The use of subsumption to collapse phenotype features into superclasses, the conversion of phenotype lists into vectors, and the visualization of phenotype vectors as heat maps and word clouds contribute to the improved visualization of neurology phenotypes in Orphadata.
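The subsumption and vectorization steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy hierarchy, the three superclasses (standing in for the thirty in the abstract), the disease feature lists, and all function names are hypothetical, and the use of per-superclass counts as vector entries is an assumption.

```python
from collections import Counter

# Hypothetical toy hierarchy: child concept -> parent concept.
PARENT = {
    "torticollis": "cervical dystonia",
    "cervical dystonia": "dystonia",
    "blepharospasm": "dystonia",
    "resting tremor": "tremor",
    "postural tremor": "tremor",
    "ataxic gait": "gait disorder",
}

# Hypothetical superclasses; their order fixes the vector layout.
SUPERCLASSES = ["dystonia", "tremor", "gait disorder"]


def subsume(concept):
    """Walk up the hierarchy until a superclass is reached, else return None."""
    while concept is not None:
        if concept in SUPERCLASSES:
            return concept
        concept = PARENT.get(concept)
    return None


def vectorize(features):
    """Convert a variable-length feature list into a fixed-length count vector."""
    counts = Counter(subsume(feature) for feature in features)
    return [counts[superclass] for superclass in SUPERCLASSES]


# Hypothetical diseases with feature lists of different lengths.
diseases = {
    "disease A": ["torticollis", "blepharospasm", "resting tremor"],
    "disease B": ["ataxic gait", "postural tremor"],
}

# One row per disease, one column per superclass.
matrix = [vectorize(features) for features in diseases.values()]
for name, row in zip(diseases, matrix):
    print(name, row)
```

Each row of `matrix` has the same length regardless of how many features the disease has, so the rows can be stacked and passed to a plotting routine such as matplotlib's `imshow` to draw a heat map.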