It is well known, but frequently overlooked, that low- and high-throughput molecular data may contain batch effects, i.e., systematic technical variation. Confounding of experimental batches with the variable(s) of interest is especially concerning, as a batch effect may then be interpreted as a biologically significant finding. An integral step toward reducing false discovery in molecular data analysis is inspecting for batch effects and accounting for this signal if present. In a 30-sample pilot Illumina Infinium HumanMethylation450 (450k array) experiment, we identified two sources of batch effects: row and chip. Here, we demonstrate two approaches taken to process the 450k data in which an R function, ComBat, was applied to adjust for the non-biological signal. In the “initial analysis,” the application of ComBat to an unbalanced study design resulted in 9,612 and 19,214 significant (FDR < 0.05) DNA methylation differences, despite none being present prior to correction. Suspicious of this dramatic change, we undertook a “revised processing” that included changes to our analysis as well as a greater number of samples, and that successfully reduced batch effects without introducing false signal. Our work supports conclusions made by an article previously published in this journal: though the ultimate antidote to batch effects is thoughtful study design, every DNA methylation microarray analysis should inspect, assess and, if necessary, account for batch effects. The analysis experience presented here can serve as a reminder to the broader community to establish research questions a priori, ensure that they match the study design, and encourage communication between technicians and analysts.
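The inspection step described above can be illustrated with a minimal sketch that tabulates samples by batch and by the variable of interest, then flags batches holding only one group. The chip names, group labels, and layout below are hypothetical and are not the study's actual sample sheet; the check is a simple design audit, not ComBat itself.

```python
from collections import Counter


def batch_group_table(batches, groups):
    """Count the samples falling in each (batch, group) cell."""
    return Counter(zip(batches, groups))


def single_group_batches(batches, groups):
    """Return batches whose samples all belong to one group.

    In such a batch the technical effect cannot be separated
    from the group effect.
    """
    seen = {}
    for batch, group in zip(batches, groups):
        seen.setdefault(batch, set()).add(group)
    return sorted(b for b, gs in seen.items() if len(gs) == 1)


# Hypothetical 12-sample layout across three chips (illustration only).
chips = ["chip1"] * 4 + ["chip2"] * 4 + ["chip3"] * 4
groups = [
    "case", "case", "control", "control",   # chip1: balanced
    "case", "case", "case", "case",         # chip2: cases only
    "control", "case", "control", "control" # chip3: unbalanced but mixed
]

table = batch_group_table(chips, groups)
flagged = single_group_batches(chips, groups)
```

Here `flagged` lists the chips on which batch and group are fully confounded. An unbalanced but mixed chip is not flagged, although, as the abstract notes, imbalance alone can still cause problems when a batch adjustment is applied.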
Background The influence of genetics on variation in DNA methylation (DNAme) is well documented. Yet confounding from population stratification is often unaccounted for in DNAme association studies. Existing approaches to address confounding by population stratification using DNAme data may not generalize to populations or tissues outside those in which they were developed. To aid future placental DNAme studies in assessing population stratification, we developed an ethnicity classifier, PlaNET (Placental DNAme Elastic Net Ethnicity Tool), using Infinium Human Methylation 450k BeadChip array (HM450k) data from placental samples in five cohorts; the classifier is also compatible with the newer EPIC platform. Results Data from 509 placental samples were used to develop PlaNET and show that it accurately predicts (accuracy = 0.938, kappa = 0.823) major classes of self-reported ethnicity/race (African: n = 58, Asian: n = 53, Caucasian: n = 389), and produces ethnicity probabilities that are highly correlated with genetic ancestry inferred from genome-wide SNP arrays (> 2.5 million SNPs) and ancestry informative markers (n = 50 SNPs). PlaNET’s ethnicity classification relies on 1860 HM450k microarray sites, and over half of these were linked to nearby genetic polymorphisms (n = 955). Our placental-optimized method outperforms existing approaches in assessing population stratification in placental samples from individuals of Asian, African, and Caucasian ethnicities. Conclusion PlaNET provides an improved approach to address population stratification in placental DNAme association studies. The method can be applied to predict ethnicity as a discrete or continuous variable and will be especially useful when self-reported ethnicity information is missing and genotyping markers are unavailable. Electronic supplementary material The online version of this article (10.1186/s13072-019-0296-3) contains supplementary material, which is available to authorized users.
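For readers unfamiliar with the two performance figures quoted above, the following sketch shows how accuracy and Cohen's kappa are conventionally computed from paired true and predicted labels. The function name and the toy labels are illustrative assumptions; this is not PlaNET's code and does not reproduce its reported values.

```python
from collections import Counter


def accuracy_and_kappa(truth, predicted):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists.

    Kappa rescales observed agreement by the agreement expected from
    the marginal label frequencies alone. It is undefined when that
    expected agreement equals 1.
    """
    n = len(truth)
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    truth_counts = Counter(truth)
    pred_counts = Counter(predicted)
    expected = sum(
        truth_counts[label] * pred_counts[label] for label in truth_counts
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Toy labels for illustration only.
truth = ["A", "A", "B", "B"]
predicted = ["A", "A", "B", "A"]
acc, kappa = accuracy_and_kappa(truth, predicted)
```

Because kappa discounts chance agreement, it is usually lower than accuracy when one class dominates the sample, which is why both figures are informative for an imbalanced set of classes.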
Background 5,10-Methylenetetrahydrofolate reductase (MTHFR) is a key enzyme in one-carbon metabolism that ensures the availability of methyl groups for methylation reactions. Two single-nucleotide polymorphisms (SNPs) in the MTHFR gene, 677C>T and 1298A>C, result in a thermolabile enzyme with reduced function. These variants, in the maternal and/or fetal genes, have been associated with pregnancy complications including miscarriage, neural tube defects (NTDs), and preeclampsia (PE), perhaps due to altered capacity for DNA methylation (DNAm). In this study, we assessed the association between MTHFR 677TT and 1298CC genotypes and risk of NTDs, PE, or normotensive intrauterine growth restriction (nIUGR). Additionally, we assessed whether these high-risk genotypes are associated with altered DNAm in the placenta. Results In 303 placentas screened for this study, we observed no significant association between the occurrence of NTDs (N = 55), PE (early-onset: N = 28, late-onset: N = 20), or nIUGR (N = 21) and placental (fetal) MTHFR 677TT or 1298CC genotypes compared to healthy pregnancies (N = 179), though a trend toward an increased frequency of the 677TT genotype in PE/IUGR together was observed (OR 2.53, p = 0.048). DNAm was profiled in 10 high-risk 677 (677TT + 1298AA), 10 high-risk 1298 (677CC + 1298CC), and 10 reference (677CC + 1298AA) genotype placentas. Linear modeling identified no significantly differentially methylated sites between high-risk 677 or 1298 and reference placentas at a false discovery rate < 0.05 and Δβ ≥ 0.05 using the Illumina Infinium HumanMethylation450 BeadChip. Using a differentially methylated region analysis or separating cytosine-guanine dinucleotides (CpGs) by CpG density to reduce multiple comparisons also did not identify differential methylation. Additionally, there was no consistent evidence for altered methylation of repetitive DNA between high-risk and reference placentas. Conclusions We conclude that large-scale, genome-wide disruption in DNAm does not occur in placentas with the high-risk MTHFR 677TT or 1298CC genotypes. Furthermore, there was no evidence for an association of the 1298CC genotype, and only a tendency toward a higher frequency of 677TT, in pregnancy complications of PE/IUGR. This may be due to small sample sizes or folate repletion in our Canadian population attenuating effects of the high-risk MTHFR variants. However, given our results and the conflicting results in the literature, investigations into alternative mechanisms that may explain the link between MTHFR variants and pregnancy complications, or in populations at risk of folate deficiencies, are warranted. Electronic supplementary material The online version of this article (10.1186/s13148-018-0468-1) contains supplementary material, which is available to authorized users.
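The significance criterion used above (false discovery rate < 0.05 together with Δβ ≥ 0.05) can be sketched as a two-part filter. The sketch assumes the Benjamini-Hochberg procedure for the FDR adjustment and assumes the Δβ threshold applies to the absolute difference; the abstract states neither, and the function names and toy values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_sites(pvalues, delta_betas, fdr=0.05, min_delta=0.05):
    """Indices of sites passing both the FDR and effect-size thresholds.

    Assumption: the effect-size cutoff is applied to |delta beta|.
    """
    q = bh_adjust(pvalues)
    return [
        i for i, (qi, d) in enumerate(zip(q, delta_betas))
        if qi < fdr and abs(d) >= min_delta
    ]


# Toy values for four CpG sites (illustration only).
pvalues = [0.001, 0.01, 0.03, 0.5]
delta_betas = [0.10, 0.02, -0.07, 0.20]
hits = significant_sites(pvalues, delta_betas)
```

Requiring an effect-size threshold alongside the adjusted p-value excludes sites whose differences are statistically detectable but too small to be of likely biological interest, such as the second site in the toy data.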