On the stationary distribution of iterative imputations

Liu, Jingchen; Gelman, Andrew; Hill, Jennifer; Su, Yu-Sung; Kropko, Jonathan

doi:10.1093/biomet/ast044

Cited by 93 publications

(146 citation statements)

References 26 publications

Supporting

Mentioning

141

Contrasting


“…The slight bias observed in analysis (a) may be a result of the imputation model being semi-compatible with the analysis model (i.e. the exposure of interest in the analysis model is change in waist circumference, however, waist circumference at wave 2 is imputed in the imputation model) [41]. We decided to impute waist circumference at wave 2 instead of change in waist circumference in order to represent the real epidemiological analysis (i.e.…”

Section: Discussion (mentioning)

confidence: 99%

The impact of missing data on analyses of a time-dependent exposure in a longitudinal cohort: a simulation study

Karahalios, Baglietto, Lee, et al. 2013. Emerg Themes Epidemiol


Background: Missing data often cause problems in longitudinal cohort studies with repeated follow-up waves. Research in this area has focussed on analyses with missing data in repeated measures of the outcome, from which participants with missing exposure data are typically excluded. We performed a simulation study to compare complete-case analysis with Multiple imputation (MI) for dealing with missing data in an analysis of the association of waist circumference, measured at two waves, and the risk of colorectal cancer (a completely observed outcome).

Methods: We generated 1,000 datasets of 41,476 individuals with values of waist circumference at waves 1 and 2 and times to the events of colorectal cancer and death to resemble the distributions of the data from the Melbourne Collaborative Cohort Study. Three proportions of missing data (15, 30 and 50%) were imposed on waist circumference at wave 2 using three missing data mechanisms: Missing Completely at Random (MCAR), and a realistic and a more extreme covariate-dependent Missing at Random (MAR) scenarios. We assessed the impact of missing data on two epidemiological analyses: 1) the association between change in waist circumference between waves 1 and 2 and the risk of colorectal cancer, adjusted for waist circumference at wave 1; and 2) the association between waist circumference at wave 2 and the risk of colorectal cancer, not adjusted for waist circumference at wave 1.

Results: We observed very little bias for complete-case analysis or MI under all missing data scenarios, and the resulting coverage of interval estimates was near the nominal 95% level. MI showed gains in precision when waist circumference was included as a strong auxiliary variable in the imputation model.

Conclusions: This simulation study, based on data from a longitudinal cohort study, demonstrates that there is little gain in performing MI compared to a complete-case analysis in the presence of up to 50% missing data for the exposure of interest when the data are MCAR, or missing dependent on covariates. MI will result in some gain in precision if a strong auxiliary variable that is not in the analysis model is included in the imputation model.


“…Otherwise, FCS is less theoretically justified, but there is much evidence that it works well in terms of approximate unbiasedness of parameter and variance estimates and coverage of confidence intervals (van Buuren, 2012;Hughes et al, 2014;Lee and Carlin, 2010). An important theoretical result was given by Liu et al (2014). They defined the set of conditional models to be compatible with a joint model if, for each conditional model and every possible set of parameter values for that model, there exists a set of parameter values for the joint model such that the conditional and joint models imply the same distribution for the dependent variable of that conditional model.…”

Section: Joint Model MI and Full-Conditional Specification (FCS) MI (mentioning)

confidence: 99%

Handling Missing Data in Matched Case-Control Studies Using Multiple Imputation

Seaman, Keogh. 2015. Biometrics


Summary. Analysis of matched case-control studies is often complicated by missing data on covariates. Analysis can be restricted to individuals with complete data, but this is inefficient and may be biased. Multiple imputation (MI) is an efficient and flexible alternative. We describe two MI approaches. The first uses a model for the data on an individual and includes matching variables; the second uses a model for the data on a whole matched set and avoids the need to model the matching variables. Within each approach, we consider three methods: full-conditional specification (FCS), joint model MI using a normal model, and joint model MI using a latent normal model. We show that FCS MI is asymptotically equivalent to joint model MI using a restricted general location model that is compatible with the conditional logistic regression analysis model. The normal and latent normal imputation models are not compatible with this analysis model. All methods allow for multiple partially-observed covariates, non-monotone missingness, and multiple controls per case. They can be easily applied in standard statistical software and valid variance estimates obtained using Rubin's Rules. We compare the methods in a simulation study. The approach of including the matching variables is most efficient. Within each approach, the FCS MI method generally yields the least-biased odds ratio estimates, but normal or latent normal joint model MI is sometimes more efficient. All methods have good confidence interval coverage. Data on colorectal cancer and fibre intake from the EPIC-Norfolk study are used to illustrate the methods, in particular showing how efficiency is gained relative to just using individuals with complete data.
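The compatibility condition described in the Seaman and Keogh citation statement above can be written compactly. The following is only a sketch of that verbal definition: the symbols are assumed notation that does not appear in the quoted text, with $g_j$ standing for the $j$-th conditional model with parameter $\theta_j \in \Theta_j$, $f$ for the joint model with parameter $\theta \in \Theta$, and $x_{-j}$ for all variables other than $x_j$.

```latex
% Compatibility as paraphrased in the quoted statement (notation assumed):
% every conditional model, at every parameter value, is matched by some
% parameter value of the joint model.
\[
\forall j,\ \forall \theta_j \in \Theta_j,\ \exists\, \theta \in \Theta :
\quad
g_j(x_j \mid x_{-j}, \theta_j) = f(x_j \mid x_{-j}, \theta).
\]
```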


“…Therefore, the estimated population mean of the outcome is

Y^{**} = \frac{\sum_{i} Y_{i} W_{i}^{**}}{\sum_{i} W_{i}^{**}}.

The effect of augmenting the social determinant (income, education or employment) on variable X I is then the difference between Y** and Y*, that is, E = Y** − Y*. The missing data are imputed by the method of multiple imputation via chain equations through the mi package in R. 3 In particular, we create 10 imputations. The variances of the complete data estimators are estimated by the Bootstrap method described in Rust and Rao (1996).…”

Section: Results of Simulation A (mentioning)

confidence: 99%


Simulations Test Impact Of Education, Employment, And Income Improvements On Minority Patients With Mental Illness

et al. 2017


Social determinants of health, such as poverty and minority background, severely disadvantage many people with mental disorders. A variety of innovative federal, state, and local programs have combined social services with mental health interventions. To explore the potential effects of such supports for addressing poverty and disadvantage on mental health outcomes, we simulated improvements in three social determinants—education, employment, and income. We used two large data sets: one from the National Institute of Mental Health that contained information about people with common mental disorders such as anxiety and depression, and another from the Social Security Administration that contained information about people who were disabled due to severe mental disorders such as schizophrenia and bipolar disorder. Our simulations showed that increasing employment was significantly correlated with improvements in mental health outcomes, while increasing education and income produced weak or nonsignificant correlations. In general, minority groups as well as the majority group of non-Latino whites improved in the desired outcomes. We recommend that health policy leaders, state and federal agencies, and insurers provide evidence-based employment services as a standard treatment for people with mental disorders.
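The weighted estimator quoted in the citation statement for this paper can be illustrated with a few lines of Python. This is a minimal sketch, not the authors' code: the toy outcomes and weights are hypothetical, and it assumes that Y* is computed with the same formula as Y** but using baseline weights W*, which the quoted excerpt does not show.

```python
from typing import Sequence


def weighted_mean(y: Sequence[float], w: Sequence[float]) -> float:
    """Return sum_i y_i * w_i / sum_i w_i, the form of the quoted estimator."""
    if len(y) != len(w):
        raise ValueError("y and w must have the same length")
    total_weight = sum(w)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(yi * wi for yi, wi in zip(y, w)) / total_weight


# Hypothetical toy data: four individuals with a binary outcome.
y = [1.0, 0.0, 1.0, 0.0]
w_star = [1.0, 1.0, 1.0, 1.0]        # assumed baseline weights W*
w_star_star = [2.0, 1.0, 2.0, 1.0]   # assumed counterfactual weights W**

y_star = weighted_mean(y, w_star)            # Y*  (assumed analogous form)
y_star_star = weighted_mean(y, w_star_star)  # Y**
effect = y_star_star - y_star                # E = Y** - Y*

print(f"Y* = {y_star:.4f}, Y** = {y_star_star:.4f}, E = {effect:.4f}")
```

With multiply imputed data, a computation of this kind would be repeated on each completed dataset and the results combined; that step is omitted here.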

