2018
DOI: 10.1111/2041-210x.13115

Methods for normalizing microbiome data: An ecological perspective

Abstract: 1. Microbiome sequencing data often need to be normalized due to differences in read depths, and recommendations for microbiome analyses generally warn against using proportions or rarefying to normalize data and instead advocate alternatives, such as upper quartile, CSS, edgeR-TMM, or DESeq-VS. Those recommendations are, however, based on studies that focused on differential abundance testing and variance standardization, rather than community-level comparisons (i.e., beta diversity). Also, standardizing the …

Cited by 277 publications (238 citation statements)
References 27 publications
“…After processing, we had 1 433 721 16S and 3 192 634 fungal ITS reads, representing 13 592 bacterial and 5431 fungal OTUs, respectively. We removed samples with <1000 reads, which resulted in omitting two bacterial samples from the 2017 season (both from the 5th year of succession), and normalised the samples using two different methods recommended for community comparison: rarefying (to 1000 sequences) and proportions (total sums scaling) (McKnight et al.). For comparison, we additionally normalised the community data using cumulative sums scaling via the package metagenomeSeq (Paulson et al.).…”
Section: Methods (mentioning, confidence: 99%)
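To make the three normalisations in this excerpt concrete, here is a minimal R sketch, assuming a hypothetical samples-by-OTU count matrix `otu` and the read-depth cut-off of 1000 used above; the CSS step follows the standard metagenomeSeq workflow rather than the citing study's actual script.

```r
# Sketch only: `otu` is a hypothetical samples x OTUs integer count matrix
# (rows = samples); this is not the authors' code.
library(vegan)         # rrarefy(), decostand()
library(metagenomeSeq) # newMRexperiment(), cumNorm(), MRcounts()

otu <- otu[rowSums(otu) >= 1000, ]            # drop samples with < 1000 reads

# 1. Rarefying: randomly subsample every sample down to 1000 reads
set.seed(1)
otu_rar <- rrarefy(otu, sample = 1000)

# 2. Proportions (total sums scaling): divide each count by its sample total
otu_prop <- decostand(otu, method = "total")

# 3. Cumulative sums scaling (CSS); metagenomeSeq expects OTUs as rows
mre <- newMRexperiment(t(otu))
mre <- cumNorm(mre, p = cumNormStatFast(mre))
otu_css <- t(MRcounts(mre, norm = TRUE, log = FALSE))
```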
“…Throughout this study, we calculated all BC by transforming the data to proportions (McKnight et al., 2018) and using the vegan package in R (Oksanen et al., 2017). For each iteration, the input parameters were randomly selected from the following values: number of OTUs that were entirely contamination = 0-150, number of OTUs not in the blank = 50-1000, and number of overlapping OTUs = 0-150 (OTUs were randomly sampled from a supplied distribution, resulting in varying amounts of DNA per OTU).…”
(mentioning, confidence: 99%)
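A minimal sketch of the Bray-Curtis (BC) calculation described in this excerpt, assuming the same hypothetical samples-by-OTU count matrix `otu`:

```r
# Sketch: Bray-Curtis dissimilarities on proportion-transformed counts.
# `otu` (samples x OTUs) is a placeholder, not data from the cited study.
library(vegan)

otu_prop <- decostand(otu, method = "total")   # convert counts to proportions
bc <- vegdist(otu_prop, method = "bray")       # Bray-Curtis dissimilarity matrix
```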
“…Finally, the number of sequencing reads for the blank and the sample were independently selected from a range of 18,000-20,000. For each iteration, we calculated Bray-Curtis dissimilarities (BC) between the uncontaminated versus contaminated sample and uncontaminated versus decontaminated sample and used those dissimilarities to judge the effectiveness of microDecon. Throughout this study, we calculated all BC by transforming the data to proportions (McKnight et al., 2018) and using the vegan package in R (Oksanen et al., 2017). Additionally, we applied multiple linear regression to the results to see how different factors influenced the effectiveness of microDecon (results are presented in Appendix S3). Finally, we ran 10,000 iterations of a slightly modified version of simulation 1 that tested the effects of simply removing contaminant OTUs (i.e., all contaminant OTUs were set to zero in the final sample).…”
(mentioning, confidence: 99%)
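The comparison used to judge decontamination can be sketched as follows, assuming hypothetical OTU count vectors `uncontam`, `contam` and `decontam` for one simulated sample; this illustrates the dissimilarity comparison only, not microDecon's own output.

```r
# Sketch: judging decontamination by comparing Bray-Curtis dissimilarities.
# `uncontam`, `contam`, `decontam` are hypothetical OTU count vectors for one
# simulated sample, all in the same OTU order.
library(vegan)

profiles <- rbind(uncontam = uncontam,
                  contam   = contam,
                  decontam = decontam)
profiles <- decostand(profiles, method = "total")   # proportions, as above

bc <- as.matrix(vegdist(profiles, method = "bray"))
bc_contaminated   <- bc["uncontam", "contam"]    # error introduced by contamination
bc_decontaminated <- bc["uncontam", "decontam"]  # error remaining after removal
# Effective decontamination: bc_decontaminated is much smaller than bc_contaminated
```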
“…In our view, if even a single feature shifts in relative abundance among groups, then this demonstrates an effect of sampling group that could be biologically interesting, albeit subtle. Such effects will go unnoticed if analyses rely on techniques such as ordination and PERMANOVA, which can provide insight into overall differences between sampling groups (McKnight et al.), but provide no statistical model to identify those features that may differ in relative abundance among groups. Accordingly, a variety of methods have been developed to perform the seemingly simple task of determining treatment-induced shifts in relative abundance, which is often referred to as ‘differential relative abundance testing’ or ‘differential expression’ testing (the latter phrase arises because the roots of many of these methods lie within the field of functional genomics; Bullard, Purdom, Hansen, & Dudoit; Dillies et al.; Paulson, Stine, Bravo, & Pop; Thorsen et al.; Weiss et al.).…”
Section: Introduction (mentioning, confidence: 99%)
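The community-level techniques mentioned here (ordination and PERMANOVA) could be run in vegan roughly as below; the count matrix `otu` and grouping data frame `meta` are hypothetical placeholders.

```r
# Sketch: community-level tests that detect overall differences between groups
# but do not identify which individual OTUs differ.
library(vegan)

bc <- vegdist(decostand(otu, method = "total"), method = "bray")

ord  <- metaMDS(bc)                         # NMDS ordination of the BC matrix
perm <- adonis2(bc ~ group, data = meta)    # PERMANOVA on the same dissimilarities
```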
“…Early approaches typically relied on repeated frequentist tests after transforming count data to account for differences in sampling effort among replicates or sampling groups, typically via rarefaction, conversion to proportions, or, for transcriptomic data, reads per kilobase per million mapped reads (Bullard et al.). More recently, rarefaction has been criticized because it can amplify the variation present within replicates and thus reduce statistical power (McMurdie & Holmes; but see McKnight et al. and Weiss et al. for counterarguments). Numerous statistical modelling approaches have arisen to account for the challenges imposed by compositional data, while avoiding rarefaction.…”
Section: Introduction (mentioning, confidence: 99%)
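Of the transformations listed in this excerpt, rarefaction and proportions are sketched above; reads per kilobase per million mapped reads (RPKM) for a single sample can be computed roughly as below, assuming hypothetical vectors `counts` (reads mapped per gene) and `lengths` (gene lengths in base pairs).

```r
# Sketch: RPKM for one sample. `counts` = reads mapped per gene,
# `lengths` = gene lengths in base pairs; both are placeholders.
rpkm <- counts / (lengths / 1e3) / (sum(counts) / 1e6)
# equivalently: rpkm <- counts * 1e9 / (lengths * sum(counts))
```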