2020
DOI: 10.1093/jamia/ocaa104
|View full text |Cite
|
Sign up to set email alerts
|

PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records

Abstract: Objective Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs. Materials and Methods PheMap is a knowledge base of medical concepts with quantified relationships to phenotypes that have been extra… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

0
40
0

Year Published

2021
2021
2024
2024

Publication Types

Select...
6
2

Relationship

0
8

Authors

Journals

citations
Cited by 38 publications
(40 citation statements)
references
References 46 publications
0
40
0
Order By: Relevance
“…We also assess the performance of two commonly used phenotyping methods, PheCode 22 and PheMap. 21 PheCode groups ICD-9/10 codes into clinically meaningful phenotypes, thereby collapsing the diagnosis code space. PheMap is a high-throughput phenotyping approach that identified concepts important to phenotypes from publicly available sources, such as MedlinePlus, MedicineNet, and Wikipedia.…”
Section: Resultsmentioning
confidence: 99%
See 1 more Smart Citation
“…We also assess the performance of two commonly used phenotyping methods, PheCode 22 and PheMap. 21 PheCode groups ICD-9/10 codes into clinically meaningful phenotypes, thereby collapsing the diagnosis code space. PheMap is a high-throughput phenotyping approach that identified concepts important to phenotypes from publicly available sources, such as MedlinePlus, MedicineNet, and Wikipedia.…”
Section: Resultsmentioning
confidence: 99%
“… 8 Previous work in this domain used supervised and unsupervised machine learning to derive phenotypes for several diseases, with different strengths and limitations (see literature review in Note S1). 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 Supervised models rely on classifiers based on manually labeled gold standards for each specific disease, which is time-consuming and not scalable. Unsupervised approaches discover phenotypes purely from the data, trying to aggregate medical concepts commonly appearing together in the patient records.…”
Section: Introductionmentioning
confidence: 99%
“…Automatically coding information in narrative text according to standardized terminologies is a key step in unlocking Electronic Health Record (EHR) documentation for use in health care. Mapping variable descriptions of clinical concepts to welldefined codes-for example, mapping "chronic heart failure" and "chron CHF" to the same ICD-10 code of I50.22-not only improves search and retrieval of medical information from EHRs or published literature (1), but also enables adding evidence from narrative documentation into artificial intelligence-driven predictive analytics and phenotyping (2). Free text is an especially valuable source for information that is not systematically recorded, or difficult to capture in standardized EHR fields, such as social determinants of health (3,4).…”
Section: Introductionmentioning
confidence: 99%
“…This article presents a general-purpose approach to expanding NLP technologies to assign standardized codes to new types of information in the EHR, and applies this approach to produce new technologies for linking EHR text to the ICF. Existing NLP technologies for coding medical information, as well as for linking text to other kinds of controlled inventories such as realworld named entities, largely rely on curated resources such as standardized vocabularies (32,33), expert knowledge graphs (2,34), and/or large-scale data sets with many thousands of samples (35,36). However, such resources have not yet been developed for the functional status domain (7), and are in fact difficult to procure for most under-studied medical concept domains, necessitating the development of new approaches.…”
Section: Introductionmentioning
confidence: 99%
“…However, like ML, NLP-based phenotype definitions are also often associated with complex processes, especially when used to conduct high-throughput phenotyping. For example, a PheMap phenotype definition consists of a set of linked concepts, the presence of which in a patient’s EHR is used to determine the probability of the patient having the condition represented [ 25 ]. The association of a phenotype with different concepts is defined within the PheMap knowledge base, which is constructed on the basis of a process that uses a specific set of NLP tools to derive these associations on the basis of the content of various text-based resources.…”
Section: Desideratamentioning
confidence: 99%