Gut Microbial Shifts Indicate Melanoma Presence and Bacterial Interactions in a Murine Model

Rossi, Marco; Aspromonte, Salvatore M.; Kohlhapp, Frederick J.; Newman, Jenna H.; Lemenze, Alex; Pepe, Russell J.; DeFina, Samuel M.; Herzog, Nora L.; Donnelly, Robert; Kuzel, Timothy M.; Reiser, Jochen; Guevara-Patiño, José A.; Zloza, Andrew

doi:10.3390/diagnostics12040958

Cited by 1 publication (7 citation statements)

References 41 publications

“…Unlike Decision Trees, Random Forest classifiers are highly popular when studying the association between cancer and the microbiome. This is supported by several published benchmarking experiments in which this model outperformed all other tested algorithms in tasks including identifying colorectal cancer [85], melanomas in mice [87], cancer subtypes [69], and other host traits [41]. Random Forests have been applied to predicting the survival time of colorectal cancer patients from gene expression and microbiome taxonomic profiles [90] and identifying several tumor types such as epithelial ovarian cancer [44], tonsillar squamous cell carcinoma [58], lung adenocarcinoma [33], colorectal cancer [28], oral squamous cell carcinoma [98], and in a multiclass classification setting [69].…”

Section: Decision Tree-based Models (supporting)

confidence: 57%

“…In this same publication, a boosting approach and a Random Forest demonstrated comparable performance [85], while in ref. 87 a boosting model showed a decreased AUC but improved precision and recall over Random Forests. Boosting methods have also been proposed to predict tissue malignancy in breast cancer using bacterial taxonomic profiles from biopsies [30] and to identify several tumor subtypes from microbiome data [18].…”

Section: Decision Tree-based Models (mentioning)

confidence: 96%

“…Feature extension can be done without increasing the computational load with the so-called kernel trick [86]. The most used kernel in tumor-associated microbiome analysis is the radial basis function (RBF) [85,87], which has achieved good results in colorectal cancer prediction [48,71]. However, when kernels are used, the prediction for a sample is based on its kernel distance to the support vectors [88].…”

Section: Support Vector Machines (mentioning)

confidence: 99%

“…SVMs have been used to predict colorectal patient survival time [89,90], cancer prognosis, and drug responses [78] from microbiome and gene expression data, and to identify colorectal cancer patients using taxonomic profiles [71,85]. Furthermore, SVMs are frequently used as benchmarks when evaluating other methods [47,87,91]. The popularity of SVMs for cancer-related host trait prediction from the microbiome is due to their widespread availability in ML libraries and the good performance demonstrated by these models in multiple works.…”

Section: Support Vector Machines (mentioning)

confidence: 99%

“…85, L2 regularized Logistic Regression achieved a comparable AUC to Random Forest and Boosted Trees for colorectal cancer identification. However, this is seldom observed [49,87,91]; as examples, Logistic Regression failed to improve colorectal polyp identification over a Multilayer Perceptron and Naïve Bayes classifier [92], and colorectal cancer identification over a Multimodal Neural Network [47]. Because of this decreased performance, Logistic Regression has mainly been used for feature selection.…”

Section: Logistic Regression (mentioning)

confidence: 99%

A review of machine learning methods for cancer characterization from microbiome data

Teixeira, Silva, Ferreira et al. 2024

npj Precis. Onc.

Recent studies have shown that the microbiome can impact cancer development, progression, and response to therapies suggesting microbiome-based approaches for cancer characterization. As cancer-related signatures are complex and implicate many taxa, their discovery often requires Machine Learning approaches. This review discusses Machine Learning methods for cancer characterization from microbiome data. It focuses on the implications of choices undertaken during sample collection, feature selection and pre-processing. It also discusses ML model selection, guiding how to choose an ML model, and model validation. Finally, it enumerates current limitations and how these may be surpassed. Proposed methods, often based on Random Forests, show promising results, however insufficient for widespread clinical usage. Studies often report conflicting results mainly due to ML models with poor generalizability. We expect that evaluating models with expanded, hold-out datasets, removing technical artifacts, exploring representations of the microbiome other than taxonomical profiles, leveraging advances in deep learning, and developing ML models better adapted to the characteristics of microbiome data will improve the performance and generalizability of models and enable their usage in the clinic.
