Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments.
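The abstract's "practical analysis of multiply imputed data" usually means fitting the same model to each imputed data set and then combining the results. A minimal Python sketch of that pooling step follows, assuming the standard combining rules (Rubin's rules) and a scalar estimate. The function name `pool_rubin` and the input numbers are hypothetical and do not come from the paper, which gives its own examples in Stata.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one scalar estimate across m imputed data sets (Rubin's rules).

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)        # average of the per-imputation estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance of the estimates (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical regression coefficients from five imputed data sets.
    est, se = pool_rubin([0.42, 0.47, 0.39, 0.45, 0.44],
                         [0.010, 0.011, 0.009, 0.010, 0.012])
    print(f"pooled estimate {est:.3f}, standard error {se:.3f}")
```

The between-imputation term is what distinguishes this from averaging the standard errors: it carries the extra uncertainty due to the missing values into the final standard error.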
Most studies have some missing data. Jonathan Sterne and colleagues describe the appropriate use and reporting of the multiple imputation approach to dealing with them.
Although plasma proteins have important roles in biological processes and are the direct targets of many drugs, the genetic factors that control inter-individual variation in plasma protein levels are not well understood. Here we characterize the genetic architecture of the human plasma proteome in healthy blood donors from the INTERVAL study. We identify 1,927 genetic associations with 1,478 proteins, a fourfold increase on existing knowledge, including trans associations for 1,104 proteins. To understand the consequences of perturbations in plasma protein levels, we apply an integrated approach that links genetic variation with biological pathway, disease, and drug databases. We show that protein quantitative trait loci overlap with gene expression quantitative trait loci, as well as with disease-associated loci, and find evidence that protein biomarkers have causal roles in disease using Mendelian randomization analysis. By linking genetic factors to diseases via specific proteins, our analyses highlight potential therapeutic targets, opportunities for matching existing drugs with new disease indications, and potential safety concerns for drugs under development.
Context: Associations of major lipids and apolipoproteins with the risk of vascular disease have not been reliably quantified.
Objective: To assess major lipids and apolipoproteins in vascular risk.
Design, Setting, and Participants: Individual records were supplied on 302 430 people without initial vascular disease from 68 long-term prospective studies, mostly in Europe and North America. During 2.79 million person-years of follow-up, there were 8857 nonfatal myocardial infarctions, 3928 coronary heart disease (CHD) deaths, 2534 ischemic strokes, 513 hemorrhagic strokes, and 2536 unclassified strokes.
Main Outcome Measures: Hazard ratios (HRs), adjusted for several conventional factors, were calculated for 1-SD higher values: 0.52 log_e triglyceride, 15 mg/dL high-density lipoprotein cholesterol (HDL-C), 43 mg/dL non-HDL-C, 29 mg/dL apolipoprotein AI, 29 mg/dL apolipoprotein B, and 33 mg/dL directly measured low-density lipoprotein cholesterol (LDL-C). Within-study regression analyses were adjusted for within-person variation and combined using meta-analysis.
Results: The rates of CHD per 1000 person-years in the bottom and top thirds of baseline lipid distributions, respectively, were 2.6 and 6.2 with triglyceride, 6.4 and 2.4 with HDL-C, and 2.3 and 6.7 with non-HDL-C. Adjusted HRs for CHD were 0.99 (95% CI, 0.94-1.05) with triglyceride, 0.78 (95% CI, 0.74-0.82) with HDL-C, and 1.50 (95% CI, 1.39-1.61) with non-HDL-C. Hazard ratios were at least as strong in participants who did not fast as in those who did. The HR for CHD was 0.35 (95% CI, 0.30-0.42) with a combination of 80 mg/dL lower non-HDL-C and 15 mg/dL higher HDL-C. For the subset with apolipoproteins or directly measured LDL-C, HRs were 1.50 (95% CI, 1.38-1.62) with the ratio non-HDL-C/HDL-C, 1.49 (95% CI, 1.39-1.60) with the ratio apo B/apo AI, 1.42 (95% CI, 1.06-1.91) with non-HDL-C, and 1.38 (95% CI, 1.09-1.73) with directly measured LDL-C. Hazard ratios for ischemic stroke were 1.02 (95% CI, 0.94-1.11) with triglyceride, 0.93 (95% CI, 0.84-1.02) with HDL-C, and 1.12 (95% CI, 1.04-1.20) with non-HDL-C.
Conclusion: Lipid assessment in vascular disease can be simplified by measurement of either total and HDL cholesterol levels or apolipoproteins without the need to fast and without regard to triglyceride.
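The abstract states that within-study regression results were combined using meta-analysis but does not say which pooling model was used. As an illustration only, the Python sketch below pools study-specific hazard ratios with a fixed-effect inverse-variance average on the log scale. That choice is an assumption, not the authors' stated method, and the function name `pool_hazard_ratios` and all input values are hypothetical.

```python
import math


def pool_hazard_ratios(hazard_ratios, log_standard_errors):
    """Fixed-effect inverse-variance pooling of study-level hazard ratios.

    hazard_ratios: one HR per study
    log_standard_errors: standard error of log(HR) for each study
    Returns (pooled_hr, ci_lower, ci_upper) with a 95% confidence interval.
    """
    if len(hazard_ratios) != len(log_standard_errors) or not hazard_ratios:
        raise ValueError("need one standard error per hazard ratio")

    # Each study is weighted by the inverse of its variance on the log scale.
    weights = [1.0 / se ** 2 for se in log_standard_errors]
    log_hrs = [math.log(hr) for hr in hazard_ratios]

    pooled_log = sum(w * x for w, x in zip(weights, log_hrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))

    z = 1.96  # normal quantile for a two-sided 95% interval
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))


if __name__ == "__main__":
    # Hypothetical per-study results, not taken from the paper.
    hr, lower, upper = pool_hazard_ratios([1.42, 1.55, 1.48], [0.10, 0.08, 0.12])
    print(f"pooled HR {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A random-effects model would add a between-study variance term to each weight, which widens the interval when the studies disagree.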
Summary
Background: Low-risk limits recommended for alcohol consumption vary substantially across different national guidelines. To define thresholds associated with lowest risk for all-cause mortality and cardiovascular disease, we studied individual-participant data from 599 912 current drinkers without previous cardiovascular disease.
Methods: We did a combined analysis of individual-participant data from three large-scale data sources in 19 high-income countries (the Emerging Risk Factors Collaboration, EPIC-CVD, and the UK Biobank). We characterised dose–response associations and calculated hazard ratios (HRs) per 100 g per week of alcohol (12·5 units per week) across 83 prospective studies, adjusting at least for study or centre, age, sex, smoking, and diabetes. To be eligible for the analysis, participants had to have information recorded about their alcohol consumption amount and status (ie, non-drinker vs current drinker), plus age, sex, history of diabetes and smoking status, at least 1 year of follow-up after baseline, and no baseline history of cardiovascular disease. The main analyses focused on current drinkers, whose baseline alcohol consumption was categorised into eight predefined groups according to the amount in grams consumed per week. We assessed alcohol consumption in relation to all-cause mortality, total cardiovascular disease, and several cardiovascular disease subtypes. We corrected HRs for estimated long-term variability in alcohol consumption using 152 640 serial alcohol assessments obtained some years apart (median interval 5·6 years [5th–95th percentile 1·04–13·5]) from 71 011 participants from 37 studies.
Findings: In the 599 912 current drinkers included in the analysis, we recorded 40 310 deaths and 39 018 incident cardiovascular disease events during 5·4 million person-years of follow-up. For all-cause mortality, we recorded a positive and curvilinear association with the level of alcohol consumption, with the minimum mortality risk around or below 100 g per week. Alcohol consumption was roughly linearly associated with a higher risk of stroke (HR per 100 g per week higher consumption 1·14, 95% CI, 1·10–1·17), coronary disease excluding myocardial infarction (1·06, 1·00–1·11), heart failure (1·09, 1·03–1·15), fatal hypertensive disease (1·24, 1·15–1·33), and fatal aortic aneurysm (1·15, 1·03–1·28). By contrast, increased alcohol consumption was log-linearly associated with a lower risk of myocardial infarction (HR 0·94, 0·91–0·97). In comparison to those who reported drinking >0–≤100 g per week, those who reported drinking >100–≤200 g per week, >200–≤350 g per week, or >350 g per week had lower life expectancy at age 40 years of approximately 6 months, 1–2 years, or 4–5 years, respectively.
Interpretation: In current drinkers of alcohol in high-income countries, the threshold for lowest risk of all-cause mortality was about 100 g/week. For cardiovascular disease subtypes other than myocardial infarction, there were no clear risk thresholds below which lower alcohol consumption stopped being ...