Multiple Imputation of Missing Data: A Simulation Study on a Binary Response

Hardt, Jochen; Herke, Max; Brian, Tamara; Laubach, W.

doi:10.4236/ojs.2013.35043

Cited by 32 publications

(23 citation statements)

References 28 publications


“…The present findings are consistent with other research that has simulated MCAR and MAR binary data in randomized trials and found minimal bias with MI (Hardt et al 2013) and little difference between MI and CCA (Caille et al, in press, Ma et al 2011). However, these studies did not test MNAR conditions, where MI consistently outperformed CCA in the present study, nor did they test LOCF or WCS approaches.…”

Section: Discussion; classification: supporting

confidence: 93%

Missing Data in Alcohol Clinical Trials with Binary Outcomes

Hallgren

Witkiewitz

Kranzler

et al. 2016

Alcohol Clin Exp Res


Background: Missing data are common in alcohol clinical trials for both continuous and binary endpoints. Approaches to handle missing data have been explored for continuous outcomes, yet no studies have compared missing data approaches for binary outcomes (e.g., abstinence, no heavy drinking days). The present study compares approaches to modeling binary outcomes with missing data in the COMBINE study.

Method: We included participants in the COMBINE Study who had complete drinking data during treatment and who were assigned to active medication or placebo conditions (N=1146). Using simulation methods, missing data were introduced under common scenarios with varying sample sizes and amounts of missing data. Logistic regression was used to estimate the effect of naltrexone (vs. placebo) in predicting any drinking and any heavy drinking outcomes at the end of treatment using four analytic approaches: complete case analysis (CCA), last observation carried forward (LOCF), the worst-case scenario of missing equals any drinking or heavy drinking (WCS), and multiple imputation (MI). In separate analyses, these approaches were compared when drinking data were manually deleted for those participants who discontinued treatment but continued to provide drinking data.

Results: WCS produced the greatest amount of bias in treatment effect estimates. MI usually yielded less biased estimates than WCS and CCA in the simulated data, and performed considerably better than LOCF when estimating treatment effects among individuals who discontinued treatment.

Conclusions: Missing data can introduce bias in treatment effect estimates in alcohol clinical trials. Researchers should utilize modern missing data methods, including MI, and avoid WCS and CCA when analyzing binary alcohol clinical trial outcomes.
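The comparison described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not use COMBINE data: the drinking probabilities, the dropout rates, and the function names `log_odds_ratio` and `simulate` are all hypothetical. It covers only CCA and WCS, and it uses the log odds ratio from a 2x2 table, which for a single binary predictor closely approximates the logistic regression slope (a 0.5 continuity correction is added to each cell).

```python
import math
import random


def log_odds_ratio(treat, outcome):
    """Log odds ratio of outcome == 1 for treat == 1 versus treat == 0."""
    # Start every cell at 0.5 (Haldane-Anscombe correction) to avoid dividing by zero.
    cells = {(t, y): 0.5 for t in (0, 1) for y in (0, 1)}
    for t, y in zip(treat, outcome):
        cells[(t, y)] += 1
    return math.log(
        (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])
    )


def simulate(seed=1, n=20000):
    """Compare CCA and WCS estimates when dropout depends on treatment arm."""
    rng = random.Random(seed)
    p_drink = {0: 0.6, 1: 0.5}    # hypothetical P(any drinking) per arm
    p_missing = {0: 0.2, 1: 0.4}  # hypothetical dropout rate per arm

    treat = [rng.randint(0, 1) for _ in range(n)]
    drink = [int(rng.random() < p_drink[t]) for t in treat]
    missing = [rng.random() < p_missing[t] for t in treat]

    # CCA: drop participants whose outcome is missing.
    kept = [(t, y) for t, y, m in zip(treat, drink, missing) if not m]
    cca = log_odds_ratio([t for t, _ in kept], [y for _, y in kept])

    # WCS: count every missing outcome as drinking.
    wcs = log_odds_ratio(treat, [1 if m else y for y, m in zip(drink, missing)])

    # True effect implied by the hypothetical probabilities above.
    true = math.log((0.5 / 0.5) / (0.6 / 0.4))
    return {"true": true, "cca": cca, "wcs": wcs}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:+.3f}")
```

In this particular setup, missingness depends only on the treatment arm, so CCA stays close to the true effect while WCS is pulled toward and past zero. The scenarios in the cited study are richer, and CCA is not expected to perform this well in general.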


“…Even the final test we conducted in the SPECTF heart dataset is not fully exhaustive of what a researcher may encounter in the real-life, as we considered only a MCAR mechanism to create the missing data. The conclusions we draw applies to cases with moderate sizes of missingness, no lower than 15 % and no higher than 30 %; we intentionally limited our evaluations to this range as for small amounts of missing data, under the MAR or MCAR mechanisms, imputation may be useless and for larger amounts caution should always be applied because estimates may become very imprecise [21]. Thus, despite the efficiency of NN imputation under these conditions, it should remembered that imputation should be carefully applied and cannot solve all the problems of incomplete data [22] and that NN imputation can have serious drawbacks as we showed for instance considering the risk of distorting data distribution or the lack of precision in imputing variables with no dependencies in a dataset or, conversely, the possibility to introduce spurious associations considering dependencies where they do not exist.…”

Section: Discussion; classification: mentioning

confidence: 99%

Nearest neighbor imputation algorithms: a critical evaluation

Beretta

Santaniello

2016

BMC Med Inform Decis Mak



Background: Nearest neighbor (NN) imputation algorithms are efficient methods to fill in missing data where each missing value on some records is replaced by a value obtained from related cases in the whole set of records. Besides the capability to substitute the missing data with plausible values that are as close as possible to the true value, imputation algorithms should preserve the original data structure and avoid distorting the distribution of the imputed variable. Despite the efficiency of NN algorithms, little is known about the effect of these methods on data structure.

Methods: Simulations on synthetic datasets with different patterns and degrees of missingness were conducted to evaluate the performance of NN with one single neighbor (1NN) and with k neighbors without (kNN) or with weighting (wkNN) in the context of different learning frameworks: plain set, reduced set after ReliefF filtering, bagging, random choice of attributes, bagging combined with random choice of attributes (Random-Forest-like method).

Results: Whatever the framework, kNN usually outperformed 1NN in terms of precision of imputation and reduced errors in inferential statistics. However, 1NN was the only method capable of preserving the data structure, and data were distorted even when small values of k neighbors were considered; distortion was more severe for resampling schemas.

Conclusions: The use of three neighbors in conjunction with ReliefF seems to provide the best trade-off between imputation error and preservation of the data structure. The very same conclusions can be drawn when imputation experiments were conducted on the single photon emission computed tomography (SPECTF) heart dataset after introduction of missing data completely at random.
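The difference between 1NN and kNN imputation described in this abstract can be made concrete with a minimal sketch. This is not the authors' implementation: it assumes numeric columns, an unweighted mean over donors, and a root-mean-square distance over the columns that both rows have observed. The function name `knn_impute` and the toy data are hypothetical, and ReliefF filtering, weighting, and bagging are left out.

```python
import math


def knn_impute(rows, k=3):
    """Return a copy of rows with None entries replaced by the mean of the
    k nearest donor rows that have that column observed."""

    def distance(a, b):
        # Root mean squared difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row with column j observed,
            # ordered by distance computed on the original (unimputed) data.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for _, v in donors[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 20.0], [3.0, 35.0], [10.0, 100.0], [2.1, None]]
    print(knn_impute(data, k=1)[4])  # copies the value of the single closest row
    print(knn_impute(data, k=3)[4])  # averages the three closest rows
```

With k=1 the imputed value is always one that already occurs in the data, whereas averaging over several donors produces new intermediate values. That is one plausible reading of why the abstract reports that only 1NN preserved the data structure.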


“…We assumed the mechanism leading to missing values to be missing at random (MAR), and therefore integrated multiple imputations into the analyses to minimize bias stemming from missing data. This method was very well suited to this task, regarding the sample size, the number of variables included in the imputation model, and the analyses to be conducted [61, 62, 63, 64]. The imputation model contained raw data for all variables used in the analysis, before items were combined into scales or dichotomized.…”

Section: Methods; classification: mentioning

confidence: 99%

Health and Well-Being of Adolescents in Different Family Structures in Germany and the Importance of Family Climate

Herke

Knöchelmann

Richter

2020

IJERPH

Self Cite


The family is of exceptional and lifelong importance to the health of adolescents. Family structure has been linked to children’s and adolescents’ health and well-being; a nuclear family has been shown to be indicative of better health outcomes as compared with a single-parent family or a step-family. Family climate is rarely included in studies on children’s and adolescents’ health and well-being, although findings have indicated its importance. Using data from n = 6838 students aged 12–13 years from the German National Educational Panel Study, this study shows that stronger familial cohesion and a better parent-child relationship are associated with better self-rated health, higher life satisfaction, more prosocial behavior, and less problematic conduct, and that these associations are stronger than those for family structure. Surveys on young people’s health are encouraged to include family climate above and beyond family structure alone.
