Positional effects revealed in Illumina Methylation Array and the impact on analysis

With the rapid evolution of epigenetic research, Illumina Infinium HumanMethylation BeadChips have been widely used to study DNA methylation. However, in evaluating the accuracy of this method, we found that the commonly used Illumina HumanMethylation BeadChips are substantially affected by positional effects: the location of a DNA sample on a chip affects the measured methylation levels. We analyzed three HumanMethylation450 and three HumanMethylation27 datasets using four methods to demonstrate the existence of positional effects. Three datasets were analyzed further by technical replicate analysis or differentially methylated CpG site analysis. The pre- and post-correction comparisons indicate that positional effects can alter the measured methylation values and the results of downstream analysis. Nevertheless, ComBat, linear regression and functional normalization can all be used to minimize this artifact. We recommend applying ComBat to correct positional effects, followed by the correction of batch effects, during data preprocessing, as this procedure slightly outperforms the others. In addition, randomizing sample placement should be a critical laboratory practice when using such experimental platforms. Code for our method is freely available at: https://github.com/ChuanJ/posibatch.

DNA methylation has also been implicated in the development of cancer [4-6] and other diseases [7-9]. Furthermore, several studies have indicated that DNA methylation levels can vary by age [10], sex
[11], disease-affected status [4-9], circadian rhythms [12], tissue types [13] and other factors. Many methods have been used to measure the methylation levels of cytosines, such as blotting, atomic force spectroscopy, genomic sequencing, bisulfite sequencing, ...

bioRxiv preprint, doi: http://dx.doi.org/10.1101/153858. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. All rights reserved. No reuse allowed without permission.

It has twelve sample sections in one array arranged in a six by two format (Fig. S1). Methyl27, by contrast, measures the methylation status of over 27,000 CpG sites in the genome using the Type I assay, with twelve sample locations arranged in twelve rows (Fig. S1).

In this article, we compared three methods for correcting the positional effects: ComBat, a linear regression model and functional normalization (FN). ComBat adjusts for known batches using an empirical Bayesian method, even with small sample sizes; the linear regression model is a classical method for removing known confounders; and FN is an unsupervised method that uses control probes as surrogates for unwanted variation. Few studies have properly addressed the positional effects, which could lead to p...