2020
DOI: 10.1101/2020.09.01.277632
Preprint

Tree-Aggregated Predictive Modeling of Microbiome Data

Abstract: Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven, parameter-free, and scalable tree-guided aggregation framework to ass…

Cited by 11 publications (13 citation statements)
References: 85 publications
“…The c-lasso package also integrates with R via the R package reticulate. We refer to reticulate's manual for technical details about connecting python environments and R. A successful use case of c-lasso is available in the R package trac (Bien et al, 2020), enabling tree-structured aggregation of predictors when features are rare.…”
Section: Calling C-lasso In R (mentioning)
confidence: 99%
“…We use a variant of the strategy proposed by Bien et al (2021) to make the strength of the regularization penalty dependent on the corresponding node's position in the tree. We introduce the following sigmoidal scaling:…”
Section: Node-adaptive Penalization (mentioning)
confidence: 99%
“…These methods restrict themselves, however, to fully binary trees. On the other hand, the trac method (Bien et al, 2021) uses tree-guided regularization (Yan and Bien, 2021) in a maximum-likelihood-type framework to predict continuous outcomes from compositional microbiome data.…”
Section: Introduction (mentioning)
confidence: 99%
“…We first illustrate our microbiome knockoff generator method without phylogeny tree information on a dataset that was used to investigate the association of dietary and environmental variables with the gut microbiota. As an example of incorporating the phylogenetic structure into microbiome feature selection, we examine the HIV dataset from Bien et al (2021). The outcome of interest is soluble CD14 levels in units of pg/ml, which has been identified as a mortality predictor in HIV infection.…”
Section: Microbiome Features Predictive Of BMI and CD14 Levels (mentioning)
confidence: 99%