2019
DOI: 10.1093/jamia/ocz170

A maximum likelihood approach to electronic health record phenotyping using positive and unlabeled patients

Abstract: Objective: Phenotyping patients using electronic health record (EHR) data conventionally requires labeled cases and controls. Assigning labels requires manual medical chart review and therefore is labor intensive. For some phenotypes, identifying gold-standard controls is prohibitive. We developed an accurate EHR phenotyping approach that does not require labeled controls. Materials and Methods: Our framework relies on a random…

Cited by 16 publications (12 citation statements)
References 27 publications
“…There are several possible directions for further improving FEAT. For one, the ability of FEAT to recapitulate expert-curated heuristics suggests that simpler expert heuristics, such as anchor variables [49], may be leveraged as teachers in a semi-supervised approach. This could be implemented with multi-stage learning, first to predict heuristics and then to predict chart-review.…”
Section: Results | Citation type: mentioning
confidence: 99%
“… Assign probability of known disease; [30] evaluate data driven selection of cases or controls such as a maximum likelihood approach. [31] Defined Logic Data use requires knowledge of data cleaning processes. Build a data dictionary documenting representation of data elements (e.g., Boolean, temporal) as well as cleaning methods.…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…The first approaches transfer real patient data into synthetic data by learning the patterns of disease and care in real background EHRs. The EMERGE framework [7] and machine learning-based approaches such as medGAN [8] or statistical approaches [9] fall into this category. While these algorithms require a comparatively minor amount of manual input from subject matter experts, they will inevitably make severe errors when capturing the disease progression and careflow models, compromising the realism of the output data [7,8].…”
Section: Prior Work | Citation type: mentioning
confidence: 99%