2023
DOI: 10.1101/2023.01.30.526187
Preprint

Uncovering the consequences of batch effect associated missing values in omics data analysis

Abstract: Statistical analyses of high-dimensional omics data are often hampered by batch effects (BEs) and missing values (MVs), but the interaction between these two issues is neither well studied nor well understood. MVs may manifest as a BE when their proportions differ across batches. Such MVs are termed Batch-Effect Associated Missing values (BEAMs). We hypothesized that BEAMs in data may introduce bias that can impede the performance of missing value imputation (MVI). To test this, we simulated data with …
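
The idea that MVs look like a BE when their proportions differ across batches can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' simulation or code: the function name `missing_proportion_by_batch`, the toy matrix, and the batch labels "A" and "B" are all hypothetical, and NaN is assumed to mark a missing value.

```python
import math


def missing_proportion_by_batch(matrix, batches):
    """Return {batch: fraction of missing entries}.

    matrix  : list of rows (features), each a list of floats;
              NaN is assumed to mark a missing value
    batches : list of batch labels, one per sample (column)
    """
    counts = {}  # batch -> [missing count, total count]
    for row in matrix:
        if len(row) != len(batches):
            raise ValueError("each row needs one value per sample")
        for value, batch in zip(row, batches):
            tally = counts.setdefault(batch, [0, 0])
            tally[0] += math.isnan(value)  # True counts as 1
            tally[1] += 1
    return {batch: missing / total for batch, (missing, total) in counts.items()}


nan = math.nan
# Hypothetical toy data: 3 features x 4 samples, two samples per batch.
toy = [
    [1.0, 2.0, nan, nan],
    [0.5, nan, nan, 1.5],
    [2.0, 2.5, 3.0, nan],
]
labels = ["A", "A", "B", "B"]

proportions = missing_proportion_by_batch(toy, labels)
print(proportions)
```

In this toy matrix, batch "A" has 1 of 6 entries missing and batch "B" has 4 of 6, so the missingness itself separates the batches. A large gap of this kind is the pattern the abstract describes as BEAMs.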

Citations: cited by 2 publications (1 citation statement)
References: 36 publications
“…microorganisms [27] or rRNA [28]), 3) the use of inappropriate software or statistical approaches (e.g. overlooked covariables and confounders [29], unexpected batch effects [30], variable filtering of noise or low expression [31][32][33], differences in quantification [8], normalization [34], pre-processing [35], differential gene expression [31,36,37], or functional enrichment [38][39][40]), and 4) diverse errors (e.g. insufficient reporting [41], errors in gene naming errors [42,43], complex software [44][45][46], or data handling [47])…”
Section: Introduction (mentioning)
confidence: 99%