2020
DOI: 10.1093/jamia/ocaa242

Informative presence and observation in routine health data: A review of methodology for clinical risk prediction

Abstract: Objective: Informative presence (IP) is the phenomenon whereby the presence or absence of patient data is potentially informative with respect to their health condition, with informative observation (IO) being the longitudinal equivalent. These phenomena predominantly exist within routinely collected healthcare data, in which data collection is driven by the clinical requirements of patients and clinicians. The extent to which IP and IO are considered when using such data to develop clinical p…

Citations: cited by 27 publications (29 citation statements)
References: 56 publications

“…Our study predicts new onset of illness and utilizes 365 days prior observation time to apply a washout window to confirm the absence of the illness and therefore it is possible that our study could suffer from informative presence [15] and could include data from sicker patients. Therefore, we include results from two sensitivity analyses where the minimum required previous observation was set to 0 days and set to 730 days in order to assess the impact of informative presence.…”
Section: Discussion (mentioning)
confidence: 99%
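
The sensitivity analysis described in this statement can be pictured as re-selecting the cohort under different minimum prior-observation requirements. The following is a minimal sketch only, not the citing authors' code: the record layout, the field names (`observation_start`, `index_date`), and the example patients are all hypothetical.

```python
from datetime import date

def eligible_ids(patients, min_prior_days):
    """Return ids of patients with at least `min_prior_days` of observation before the index date."""
    return [
        p["id"]
        for p in patients
        if (p["index_date"] - p["observation_start"]).days >= min_prior_days
    ]

# Hypothetical patients, all sharing one index date for simplicity.
patients = [
    {"id": "a", "observation_start": date(2018, 1, 1), "index_date": date(2020, 6, 1)},
    {"id": "b", "observation_start": date(2019, 9, 1), "index_date": date(2020, 6, 1)},
    {"id": "c", "observation_start": date(2019, 3, 1), "index_date": date(2020, 6, 1)},
]

# Main analysis (365 days) plus the two sensitivity settings (0 and 730 days).
cohorts = {days: eligible_ids(patients, days) for days in (0, 365, 730)}
```

Comparing model performance across the three cohorts is one way to gauge how much the prior-observation requirement, and hence informative presence, shapes the results.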
“…While multiple imputation is often used in clinical prediction models because it gives unbiased estimates under the missing at random (MAR) assumption, it is unlikely that the MAR assumption holds in the routinely-collected EHR data that we use [45]. The missing indicator method that we adopt does not rely on the MAR assumption and has been found to lead to improved predictive performance in EHR data [43-45]. Furthermore, we do not seek to make prognostic predictions for patients after clinicians have identified them as entering the last few hours or days of life.…”
Section: Discussion (mentioning)
confidence: 99%
“…We handle missing data using the missingness indicator approach because the recording in the EHR of a clinical parameter, regardless of the value, is often indicative of the treating health professional’s contemporaneous view of the patient’s prognosis [43, 44]. To do this we augment the set of potential predictors with binary variables that indicate whether, during the window of time we consider, any measurement of the corresponding parameter is available for that patient.…”
Section: Methods (mentioning)
confidence: 99%
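
The augmentation described in this statement, one binary variable per parameter flagging whether any measurement exists in the window, might look roughly like the sketch below. It assumes long-format records of `(patient_id, parameter, taken_at, value)`; that layout, the function name, and the sample data are hypothetical and are not taken from the citing paper.

```python
from datetime import datetime

def measurement_indicators(records, patient_ids, parameters, start, end):
    """Return {patient_id: {parameter: 0 or 1}}, where 1 means at least one
    measurement of that parameter was recorded within [start, end]."""
    flags = {pid: {p: 0 for p in parameters} for pid in patient_ids}
    for pid, parameter, taken_at, _value in records:
        if pid in flags and parameter in flags[pid] and start <= taken_at <= end:
            flags[pid][parameter] = 1  # the measured value itself is ignored
    return flags

# Hypothetical long-format EHR records.
records = [
    ("p1", "lactate", datetime(2020, 1, 2), 2.1),
    ("p1", "albumin", datetime(2019, 12, 1), 38.0),  # falls outside the window
    ("p2", "albumin", datetime(2020, 1, 3), 41.0),
]

flags = measurement_indicators(
    records,
    patient_ids=["p1", "p2"],
    parameters=["lactate", "albumin"],
    start=datetime(2020, 1, 1),
    end=datetime(2020, 1, 7),
)
```

The resulting flags would be appended to the predictor set alongside the measured values.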
“…We created missingness indicators for each predictor with 1 or more missing values, which marked the observations that were missing a value. Inclusion of missingness indicators often improves predictive performance (Agor et al 2019; Sperrin et al 2020), in part because it can reflect the information-seeking behavior of clinicians stemming from medical diagnosis and evaluation (Agniel et al 2018; Groenwold 2020; Sisk et al 2021). The set of missingness indicators was analyzed for perfect collinearity, and duplicate indicators were dropped.…”
Section: Missing Data (mentioning)
confidence: 99%
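
As an illustration of the procedure in this statement, the sketch below builds one indicator per predictor that has at least one missing value and then drops indicators identical to one already kept. It is a simplification under stated assumptions: rows are plain dictionaries with `None` marking a missing value, and "perfect collinearity" is approximated by exact duplication of indicator columns only. The function name and the example data are hypothetical.

```python
def missingness_indicators(rows, predictors):
    """Build 0/1 missingness indicators for predictors with any missing value,
    keeping only the first of any set of identical indicator columns."""
    indicators = {}
    seen_columns = set()
    for name in predictors:
        column = tuple(1 if row.get(name) is None else 0 for row in rows)
        if not any(column):
            continue  # predictor is fully observed, no indicator needed
        if column in seen_columns:
            continue  # identical to an indicator already kept, so drop it
        seen_columns.add(column)
        indicators[name + "_missing"] = list(column)
    return indicators

# Hypothetical data: ldl and hdl are always missing together.
rows = [
    {"hba1c": 6.1, "ldl": None, "hdl": None, "age": 54},
    {"hba1c": None, "ldl": 3.2, "hdl": 1.1, "age": 61},
    {"hba1c": 5.8, "ldl": None, "hdl": None, "age": 47},
]

result = missingness_indicators(rows, ["hba1c", "ldl", "hdl", "age"])
```

Here `hdl` produces the same indicator as `ldl`, so only the `ldl` indicator is retained, and `age` gets none because it is never missing.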
“…Additional details are provided in the supplemental information. Multiple imputation was not necessary because our scientific goal was to characterize predictive performance for the unimputed outcome variable, rather than to estimate statistical parameters for covariates that were imputed, such as linear regression coefficients (Sisk et al 2021; Sperrin et al 2020).…”
Section: Missing Data (mentioning)
confidence: 99%