Tree-aggregated predictive modeling of microbiome data

Bien, Jacob; Yan, Xiaohan; Simpson, Léo; Müller, Christian L.

doi:10.1038/s41598-021-93645-3

Cited by 24 publications

(29 citation statements)

References 62 publications


“…In its present form, the scCODA framework requires pre-specified cell-type definitions which, in turn, hinge on statistically sound and biologically meaningful clustering assignments. In situations where crisp clustering boundaries are elusive, for instance, due to the presence of the transient developmental processes underlying the data, joint modeling of different resolution hierarchies 26 or modeling compositional processes 27, 28 may help account for such continuities changes. Furthermore, scCODA assumes a log-linear relationship between covariates and cell abundance, which may be mis-specified in some cases.…”

Section: Discussion (mentioning)

confidence: 99%

scCODA is a Bayesian model for compositional single-cell data analysis

et al. 2021


Compositional changes of cell types are main drivers of biological processes. Their detection through single-cell experiments is difficult due to the compositionality of the data and low sample sizes. We introduce scCODA (https://github.com/theislab/scCODA), a Bayesian model addressing these issues enabling the study of complex cell type effects in disease, and other stimuli. scCODA demonstrated excellent detection performance, while reliably controlling for false discoveries, and identified experimentally verified cell type changes that were missed in original analyses.


“…We use a variant of the strategy proposed by Bien et al (2021) to make the strength of the regularization penalty dependent on the corresponding node’s position in the tree. We introduce the following sigmoidal scaling: …”

Section: Methods (mentioning)

confidence: 99%

“…These methods restrict themselves, however, to fully binary trees. On the other hand, the trac method (Bien et al, 2021) uses tree-guided regularization (Yan and Bien, 2021) in a maximum-likelihood-type framework to predict continuous outcomes from compositional microbiome data.…”

Section: Introduction (mentioning)

confidence: 99%

tascCODA: Bayesian Tree-Aggregated Analysis of Compositional Amplicon and Single-Cell Data

2021

Self Cite


Accurate generative statistical modeling of count data is of critical relevance for the analysis of biological datasets from high-throughput sequencing technologies. Important instances include the modeling of microbiome compositions from amplicon sequencing surveys and the analysis of cell type compositions derived from single-cell RNA sequencing. Microbial and cell type abundance data share remarkably similar statistical features, including their inherent compositionality and a natural hierarchical ordering of the individual components from taxonomic or cell lineage tree information, respectively. To this end, we introduce a Bayesian model for tree-aggregated amplicon and single-cell compositional data analysis (tascCODA) that seamlessly integrates hierarchical information and experimental covariate data into the generative modeling of compositional count data. By combining latent parameters based on the tree structure with spike-and-slab Lasso penalization, tascCODA can determine covariate effects across different levels of the population hierarchy in a data-driven parsimonious way. In the context of differential abundance testing, we validate tascCODA’s excellent performance on a comprehensive set of synthetic benchmark scenarios. Our analyses on human single-cell RNA-seq data from ulcerative colitis patients and amplicon data from patients with irritable bowel syndrome, respectively, identified aggregated cell type and taxon compositional changes that were more predictive and parsimonious than those proposed by other schemes. We posit that tascCODA constitutes a valuable addition to the growing statistical toolbox for generative modeling and analysis of compositional changes in microbial or cell population data.


“…We use a variant of the strategy proposed by Bien et al (2021) to make the strength of the regularization penalty dependent on the corresponding node's position in the tree. We introduce the following sigmoidal scaling:…”

Section: Node-adaptive Penalization (mentioning)

confidence: 99%

“…These methods restrict themselves, however, to fully binary trees. On the other hand, the trac method (Bien et al, 2021) uses tree-guided regularization (Yan and Bien, 2021) in a maximum-likelihood-type framework to predict continuous outcomes from compositional microbiome data.…”

Section: Introduction (mentioning)

confidence: 99%

tascCODA: Bayesian tree-aggregated analysis of compositional amplicon and single-cell data

Ostner, Carcy, Müller

2021

Preprint

Self Cite


Accurate generative statistical modeling of count data is of critical relevance for the analysis of biological datasets from high-throughput sequencing technologies. Important instances include the modeling of microbiome compositions from amplicon sequencing surveys and the analysis of cell type compositions derived from single-cell RNA sequencing. Microbial and cell type abundance data share remarkably similar statistical features, including their inherent compositionality and a natural hierarchical ordering of the individual components from taxonomic or cell lineage tree information, respectively. To this end, we introduce a Bayesian model for tree-aggregated amplicon and single-cell compositional data analysis (tascCODA) that seamlessly integrates hierarchical information and experimental covariate data into the generative modeling of compositional count data. By combining latent parameters based on the tree structure with spike-and-slab Lasso penalization, tascCODA can determine covariate effects across different levels of the population hierarchy in a data-driven parsimonious way. In the context of differential abundance testing, we validate tascCODA's excellent performance on a comprehensive set of synthetic benchmark scenarios. Our analyses on human single-cell RNA-seq data from ulcerative colitis patients and amplicon data from patients with irritable bowel syndrome, respectively, identified aggregated cell type and taxon compositional changes that were more predictive and parsimonious than those proposed by other schemes. We posit that tascCODA constitutes a valuable addition to the growing statistical toolbox for generative modeling and analysis of compositional changes in microbial or cell population data.
