2017
DOI: 10.1093/jamia/ocx071
Biases introduced by filtering electronic health records for patients with “complete data”

Abstract: As investigators design studies, they need to balance their confidence in the completeness of the data with the effects of placing requirements on the data on the resulting patient cohort.

Cited by 72 publications (53 citation statements). References 20 publications.
“…[11–14, 17–19] It is known that invoking certain data sufficiency requirements selects for a sicker patient population, since these patients have more frequent data collection.[14, 17] In our cohort, shorter time between dates of lipid and BP measurements was associated with a greater number of comorbidities and with poorer BP and cholesterol values.…”
Section: Discussion
confidence: 94%
“…In the analytic phase of research, investigators often limit analyses only to patients with a complete set of data. [14, 31] For example, the eMERGE network has defined completeness as having one biobanked sample, at least two clinical visits, and data from each of several data categories. [36] Our results and those of others suggest that such data sufficiency requirements might introduce bias that should be examined before investigators choose this strategy over other methods to account for missing data,[14] including multiple imputation.…”
Section: Discussion
confidence: 99%
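The statement above contrasts complete-case filtering (keeping only patients with a full set of data) against methods that account for missingness, such as multiple imputation. A minimal, hypothetical sketch of the difference, using a toy cohort and simple mean imputation as a stand-in for full multiple imputation:

```python
# Toy cohort: each row is (age, systolic_bp); None marks a missing value.
# The cohort and variable names are illustrative, not from the paper.
cohort = [
    (54, 128.0),
    (61, None),   # BP never measured: dropped by complete-case filtering
    (47, 141.0),
    (70, None),
    (58, 135.0),
]

# Complete-case ("complete data") filtering: keep only fully observed rows.
# Patients with sparse records vanish, which can bias the retained cohort.
complete = [row for row in cohort if all(v is not None for v in row)]

# Imputation alternative: fill missing BP with the observed mean,
# so every patient stays in the analysis.
observed_bp = [bp for _, bp in cohort if bp is not None]
mean_bp = sum(observed_bp) / len(observed_bp)
imputed = [(age, bp if bp is not None else mean_bp) for age, bp in cohort]

print(len(complete))  # 3 patients survive the filter
print(len(imputed))   # all 5 patients retained
```

Multiple imputation proper would draw several plausible fills per missing value and pool the resulting estimates; the sketch only shows why filtering shrinks (and potentially skews) the cohort while imputation does not.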
“…To tune hyperparameters if any in these models, we mask out 20% observed measurements in the training set as a validation set and tune hyperparameters on the validation set. We introduce π^(1), π^(2), and π^(3) as hyperparameters of our imputation model. In our experiments, all mixture models start with the same mixing weights, i.e., π^(k)_{v,b} = π^(k) for k = 1, 2, 3.…”
Section: Evaluation Of Imputation Quality
confidence: 99%
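The masking scheme quoted above (hold out 20% of observed entries, then score the imputer on recovering them) can be sketched as follows. The data layout and names are assumptions for illustration; the mixing weights π^(k) belong to the cited mixture model and are not reproduced here:

```python
import random

random.seed(0)  # deterministic split for the sketch

# Toy observed measurements: (patient, variable) -> value.
observed = {(p, v): float(10 * p + v) for p in range(10) for v in range(5)}

# Hold out 20% of observed entries as a validation set; an imputer would
# be trained with these entries masked and scored on recovering them.
keys = sorted(observed)
n_val = int(0.2 * len(keys))
val_keys = set(random.sample(keys, n_val))

train = {k: x for k, x in observed.items() if k not in val_keys}
validation = {k: observed[k] for k in val_keys}

print(len(train), len(validation))  # 40 10
```

Hyperparameters (here, hypothetically π^(1), π^(2), π^(3)) would then be chosen to minimize reconstruction error on `validation`, with the final model refit on all observed entries.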