2021
DOI: 10.1101/2021.09.29.462387
Preprint

A systematic evaluation of 41 DNA methylation predictors across 101 data preprocessing and normalization strategies highlights considerable variation in algorithm performance

Abstract: Background: DNA methylation (DNAm) based predictors hold great promise to serve as clinical tools for health interventions and disease management. While these algorithms often have high prediction accuracy and are associated with many disease-related phenotypes, the reliability of their performance remains to be determined. We therefore conducted a systematic evaluation across 101 different data processing strategies that preprocess and normalize DNAm data and assessed how each analytical strategy affects the r…

Cited by 5 publications (4 citation statements)
References 42 publications
“…Another limitation is measurement error. Although the reliability of AgeAccel and the inflammatory markers was not assessed in this study, previous research indicates that epigenetic age variables have high reliability ( 44 ) while inflammatory marker variables are measured with more error (ICCs ~ 0.6) ( 10 ). Such measurement error was likely nondifferential and would likely have introduced bias towards the null, leading to an underestimation of the association between inflammation and epigenetic aging.…”
Section: Discussion (mentioning)
confidence: 90%
“…Consequently, there may exist particular sets of CpGs which are essential to the function of tissue specific aging clocks [49]. However, our results suggest that the majority of the CpGs have significant redundancy even for use in a tissue specific age predictor.…”
Section: Results (mentioning)
confidence: 77%
“…Preprocessing and normalization of methylation data is typically performed within R using the minfi [46], wateRmelon [47], or SeSAMe [48] packages, however methylCIPHER functions regardless of normalization protocol. For more details regarding the effects of choice of normalization, refer to Ori et al [49]. The user must simply have an object of matrix or data frame class, with named columns corresponding to the Illumina CpG names, and cells containing methylation beta ratios between 0 and 1.…”
Section: Methods (mentioning)
confidence: 99%
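The input format described in the Methods statement above (named CpG columns, beta values between 0 and 1) can be illustrated with a small check. This is a minimal sketch in Python, not part of methylCIPHER or any of the cited R packages: the function name `check_beta_table`, the dict-of-columns layout, and the `cg` plus eight digits naming pattern are all assumptions made for illustration.

```python
import re

# Assumption: Illumina CpG probe IDs look like "cg" followed by eight digits.
CPG_NAME = re.compile(r"^cg\d{8}$")


def check_beta_table(columns):
    """Return a list of problems found in a {cpg_name: [beta, ...]} table.

    Hypothetical helper: each key is a column name, each value holds one
    beta value per sample (None marks a missing measurement).
    """
    problems = []
    # Every column should cover the same number of samples.
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        problems.append("columns have unequal numbers of samples")
    for name, values in columns.items():
        if not CPG_NAME.match(name):
            problems.append(f"{name}: not a recognised CpG name")
        for row, beta in enumerate(values):
            if beta is None:
                continue  # missing value, tolerated in this sketch
            if not 0.0 <= beta <= 1.0:
                problems.append(f"{name}[{row}]: beta {beta} outside [0, 1]")
    return problems


good = {"cg00000029": [0.12, 0.87], "cg00000108": [0.5, None]}
bad = {"cg00000029": [0.12, 1.4], "probe_7": [0.3, 0.2]}

print(check_beta_table(good))
print(check_beta_table(bad))
```

A table that passes returns an empty list; each out-of-range value or unrecognised column name adds one entry, so the caller can decide whether to stop or only warn.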
“…BeadChip technology makes large signatures easily applicable since all relevant CpGs are addressed in each sample. However, adaptation and integration of different microarray datasets remains a major hurdle and age-predictors may become outdated if a BeadChip release is discontinued (Ori et al, 2021). It may therefore be advantageous to rather focus on individual CpGs by targeted methods, such as pyrosequencing, digital droplet PCR or barcoded amplicon sequencing (Wagner, 2022).…”
Section: Discussion (mentioning)
confidence: 99%