2016
DOI: 10.1186/s12859-016-1212-5

Risk-conscious correction of batch effects: maximising information extraction from high-throughput genomic datasets

Abstract: Background: Batch effects are a persistent and pervasive form of measurement noise which undermines the scientific utility of high-throughput genomic datasets. At their most benign, they reduce the power of statistical tests, resulting in actual effects going unidentified. At their worst, they constitute confounds and render datasets useless. Attempting to remove batch effects will result in some of the biologically meaningful component of the measurement (i.e. signal) being lost. We present and benchmark a novel …

Cited by 58 publications (82 citation statements) · References 27 publications
“…Prior to correction for known technical batch effects (sample source and array batch), we ensured that variables of interest (disease outcome, viral escape) had a reasonably balanced distribution among batches and estimated the number of latent variables by sva (Leek and Storey, 2007). Probe-set-level data were batch-effect corrected using the risk-conscious PCA-based adjustment method Harman (Oytam et al., 2016), while controlling for variables of interest. We reran sva to confirm successful batch correction and that no significant latent variables remained in the data.…”
Section: STAR Methods
confidence: 99%
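The workflow described in this citation statement (estimate latent variables with sva, batch-correct with Harman while controlling for the variables of interest, then re-check with sva) can be sketched in R roughly as follows. The object names (exprs_mat, pheno, outcome, batch) are placeholders for illustration, not taken from the cited study.

```r
library(sva)
library(Harman)

## exprs_mat: probe-set-level matrix (features x samples)
## pheno: sample annotation with hypothetical columns 'outcome' and 'batch'
mod <- model.matrix(~ outcome, data = pheno)   # model containing the variable of interest

## Estimate the number of latent variables before correction
n_sv_before <- num.sv(exprs_mat, mod, method = "be")

## Risk-conscious, PCA-based correction with Harman, controlling for the
## variable of interest via the 'expt' argument
hr <- harman(exprs_mat, expt = pheno$outcome, batch = pheno$batch, limit = 0.95)
corrected <- reconstructData(hr)

## Re-estimate latent variables to check that none remain after correction
n_sv_after <- num.sv(corrected, mod, method = "be")
```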
“…These batch effects for the arrays at birth were removed using the Harman software package [54]. This method computes, and removes as noise, batch-to-batch variability in the data to the extent that it cannot be accounted for by the observed biological variance with an acceptable probability.…”
Section: Methods
confidence: 99%
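In the Harman package, the "acceptable probability" referred to here is set through the limit argument, which caps how much batch-to-batch variation is removed relative to what the biological grouping can explain. A minimal usage sketch (dat, treatment and batch are illustrative names, not from the cited study):

```r
library(Harman)

## dat: features x samples matrix; treatment and batch: factors of length ncol(dat)
hr <- harman(dat, expt = treatment, batch = batch, limit = 0.95)  # 0.95 = confidence limit
summary(hr)                           # how much correction was applied per principal component
dat_corrected <- reconstructData(hr)  # rebuild the batch-corrected data matrix
```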
“…Different normalization methods have their advantages and disadvantages [37][38][39] in removing batch effects; however, they can also become a critical problem when correcting imbalanced data [32,36]. We examined the performance of several approaches, including ComBat [40] as part of the sva package [41], Quantile Normalization (QN) [42], Remove Unwanted Variation (RUV) [43] and Harman [44]. We identified superior batch effect minimization with ComBat in comparison to the other methods and used it to normalize the data.…”
Section: Datasets Collected from Gene Expression Omnibus (GEO)
confidence: 99%
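The methods compared here are all available as R/Bioconductor functions. A rough sketch of applying three of them to the same matrix is shown below; RUV is omitted because it additionally requires negative-control features, and the object names (exprs_mat, pheno, group, batch) are assumptions for illustration.

```r
library(sva)             # ComBat
library(preprocessCore)  # quantile normalisation
library(Harman)

mod <- model.matrix(~ group, data = pheno)   # biological covariate to preserve

combat_out <- ComBat(dat = exprs_mat, batch = pheno$batch, mod = mod)
qn_out     <- normalize.quantiles(exprs_mat)                    # note: returns an unnamed matrix
harman_out <- reconstructData(harman(exprs_mat, expt = pheno$group,
                                     batch = pheno$batch, limit = 0.95))
```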
“…We have examined several batch correction methods: ComBat [40] as part of the sva package [41], Quantile Normalization (QN) [42], Remove Unwanted Variation (RUV) [43] and Harman [44]. Currently, the most common technique for removing systematic batch effects from biological data is ComBat [40], which is based on an empirical Bayes method to estimate batch effects and adjust the data across genes.…”
Section: Cross-study Normalization in Data Merging
confidence: 99%
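As described in this passage, ComBat estimates per-gene location and scale batch parameters and shrinks them with empirical Bayes before adjusting the data. A sketch of the call in the sva package, with a model matrix that preserves the biological variable (object and column names are assumed):

```r
library(sva)

mod <- model.matrix(~ phenotype, data = sample_info)  # 'phenotype', 'sample_info' are assumed names
adjusted <- ComBat(dat = expr_matrix,                 # genes x samples
                   batch = sample_info$batch,
                   mod = mod,
                   par.prior = TRUE)                  # parametric empirical Bayes shrinkage
```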