2021
DOI: 10.1038/s41598-021-93645-3

Tree-aggregated predictive modeling of microbiome data

Abstract: Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven and scalable tree-guided aggregation framework to associate microbial …

Cited by 24 publications (29 citation statements)
References 62 publications
“…In its present form, the scCODA framework requires pre-specified cell-type definitions which, in turn, hinge on statistically sound and biologically meaningful clustering assignments. In situations where crisp clustering boundaries are elusive, for instance, due to the presence of the transient developmental processes underlying the data, joint modeling of different resolution hierarchies 26 or modeling compositional processes 27 , 28 may help account for such continuities changes. Furthermore, scCODA assumes a log-linear relationship between covariates and cell abundance, which may be mis-specified in some cases.…”
Section: Discussion (mentioning)
confidence: 99%
“…We use a variant of the strategy proposed by Bien et al (2021) to make the strength of the regularization penalty dependent on the corresponding node’s position in the tree. We introduce the following sigmoidal scaling: …”
Section: Methods (mentioning)
confidence: 99%
“…These methods restrict themselves, however, to fully binary trees. On the other hand, the trac method (Bien et al, 2021) uses tree-guided regularization (Yan and Bien, 2021) in a maximum-likelihood-type framework to predict continuous outcomes from compositional microbiome data.…”
Section: Introduction (mentioning)
confidence: 99%
“…We use a variant of the strategy proposed by Bien et al (2021) to make the strength of the regularization penalty dependent on the corresponding node's position in the tree. We introduce the following sigmoidal scaling:…”
Section: Node-adaptive Penalization (mentioning)
confidence: 99%
“…These methods restrict themselves, however, to fully binary trees. One the other hand, the trac method (Bien et al, 2021) uses tree-guided regularization (Yan and Bien, 2021) in a maximum-likelihood-type framework to predict continuous outcomes from compositional microbiome data.…”
Section: Introduction (mentioning)
confidence: 99%