Cardiovascular disease (CVD) is the leading cause of human mortality. In recent years, the gut microbiota has emerged as a factor influencing CVD, alongside genetics and environmental factors. Although cause-effect relationships are not clearly established, the reported associations between alterations in gut microbiota and CVD are prominent. Therefore, we hypothesized that machine learning (ML) could be used for gut microbiome–based diagnostic screening of CVD. To test our hypothesis, fecal 16S ribosomal RNA sequencing data of 478 CVD and 473 non-CVD human subjects collected through the American Gut Project were analyzed using 5 supervised ML algorithms: random forest, support vector machine, decision tree, elastic net, and neural networks. Thirty-nine differential bacterial taxa were identified between the CVD and non-CVD groups. ML modeling using these taxonomic features achieved a testing area under the receiver operating characteristic curve (0.0, perfect antidiscrimination; 0.5, random guessing; 1.0, perfect discrimination) of ≈0.58 (random forest and neural networks). Next, the ML models were trained with the top 500 high-variance operational taxonomic unit features, instead of bacterial taxa, and an improved testing area under the receiver operating characteristic curve of ≈0.65 (random forest) was achieved. Further, by limiting the selection to only the top 25 highly contributing operational taxonomic unit features, the area under the receiver operating characteristic curve was significantly enhanced to ≈0.70. Overall, our study is the first to identify dysbiosis of gut microbiota in CVD patients as a group and apply this knowledge to develop a gut microbiome–based ML approach for diagnostic screening of CVD.
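The 0.0 / 0.5 / 1.0 scale quoted for the area under the receiver operating characteristic curve can be illustrated with a minimal sketch. It assumes the standard rank interpretation of the metric (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half). The function name `roc_auc` and the toy labels and scores are hypothetical; this is not the study's code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = CVD, 0 = non-CVD.
labels = [1, 1, 1, 0, 0, 0]

# Every positive outranks every negative: perfect discrimination.
perfect = roc_auc(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])

# Every negative outranks every positive: perfect antidiscrimination.
inverted = roc_auc(labels, [0.1, 0.2, 0.3, 0.7, 0.8, 0.9])

# Identical scores carry no information: equivalent to random guessing.
uninformative = roc_auc(labels, [0.5] * 6)
```

On this reading, the reported values of ≈0.58 to ≈0.70 mean that a CVD sample receives a higher score than a non-CVD sample in roughly 58% to 70% of such pairs.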
Despite the availability of various diagnostic tests for inflammatory bowel diseases (IBD), misdiagnosis of IBD occurs frequently, and thus there is a clinical need to further improve the diagnosis of IBD. As gut dysbiosis is reported in IBD patients, we hypothesized that supervised machine learning (ML) could be used to analyze gut microbiome data for predictive diagnostics of IBD. To test our hypothesis, fecal 16S metagenomic data of 729 IBD and 700 non-IBD subjects from the American Gut Project were analyzed using five different ML algorithms. Fifty differential bacterial taxa were identified (LEfSe: LDA > 3) between the IBD and non-IBD groups, and ML classifications trained with these taxonomic features using random forest (RF) achieved a testing area under the receiver operating characteristic curve (AUC) of ~0.80. Next, we tested if operational taxonomic units (OTUs), instead of bacterial taxa, could be used as ML features for diagnostic classification of IBD. The top 500 high-variance OTUs were used for ML training, and an improved testing AUC of ~0.82 (RF) was achieved. Lastly, we tested if supervised ML could be used for differentiating Crohn's disease (CD) and ulcerative colitis (UC). Using 331 CD and 141 UC samples, 117 differential bacterial taxa (LEfSe: LDA > 3) were identified, and the RF model trained with differential taxonomic features or high-variance OTU features achieved a testing AUC > 0.90. In summary, our study demonstrates the promising potential of artificial intelligence via supervised ML modeling for predictive diagnostics of IBD using gut microbiome data.
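Selecting the top high-variance OTUs as ML features can be sketched as follows. This is an illustrative assumption about what such a filter might look like, not the authors' pipeline: the abstract does not state how variance was computed or how the abundance table was normalized. The function name `top_variance_features`, the OTU identifiers, and the abundance values are all hypothetical, and population variance is used here purely for simplicity.

```python
from statistics import pvariance


def top_variance_features(table, k):
    """Return the ids of the k features with the largest variance.

    `table` maps a feature id (for example an OTU id) to a list of
    per-sample abundances. Features are ranked by population variance
    across samples, highest first.
    """
    ranked = sorted(table, key=lambda f: pvariance(table[f]), reverse=True)
    return ranked[:k]


# Toy abundance table (hypothetical values), four samples per OTU.
otu_table = {
    "OTU_a": [0.10, 0.10, 0.10, 0.10],  # constant across samples
    "OTU_b": [0.00, 0.40, 0.05, 0.35],  # varies strongly
    "OTU_c": [0.20, 0.25, 0.20, 0.25],  # varies slightly
}

# Keep the two most variable OTUs; the study kept the top 500.
selected = top_variance_features(otu_table, 2)
```

A filter of this kind is unsupervised: it does not look at the disease labels, so it can be applied before the train/test split without leaking label information, although that design point is a general observation rather than something the abstract states.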
Dilated cardiomyopathy (DCM) is one of the most common causes of heart failure. Several studies have used RNA-sequencing (RNA-seq) to profile differentially expressed genes (DEGs) associated with DCM. In this study, we aimed to profile gene expression signatures and identify novel genes associated with DCM through a quantitative meta-analysis of three publicly available RNA-seq studies using human left ventricle tissues from 41 DCM cases and 21 control samples. Our meta-analysis identified 789 DEGs, including 581 downregulated and 208 upregulated genes. Several previously reported DCM-related genes, including MYH6, CKM, NKX2-5 and ATP2A2, were among the top 50 DEGs. Our meta-analysis also identified 39 new DEGs that were not detected using the individual RNA-seq datasets. Some of those genes, including PTH1R, ADAM15 and S100A4, are consistent with previous reports of associations with cardiovascular functions. Using DEGs from this meta-analysis, Ingenuity Pathway Analysis (IPA) identified five activated toxicity pathways, with failure of heart as the most significant pathway. Among the upstream regulators, SMARCA4 was downregulated and prioritized by IPA as the top affected upstream regulator for several DCM-related genes. To our knowledge, this study is the first to perform a transcriptomic meta-analysis for clinical DCM using RNA-seq datasets. Overall, our meta-analysis successfully identified a core set of genes associated with DCM.
Background: Cryopreserved peripheral blood mononuclear cells (PBMCs) are frequently collected and provide disease- and treatment-relevant data in clinical studies. Here, we developed a combined protein (40 antibodies) and transcript single-cell RNA sequencing (scRNA-seq) approach for PBMCs. Results: Among 31 participants in the Women’s Interagency HIV Study (WIHS), we sequenced 41,611 cells. Using Boolean gating followed by Seurat UMAPs (a tool for visualizing high-dimensional data) and Louvain clustering, we identified 50 subsets among CD4+ T, CD8+ T, B, NK cells, and monocytes. This resolution was superior to flow cytometry, mass cytometry, or scRNA-seq without antibodies. Combined protein and transcript scRNA-seq allowed for the assessment of disease-related changes in transcriptomes and cell type proportions. As a proof-of-concept, we showed such differences between healthy and matched individuals living with HIV with and without cardiovascular disease. Conclusions: Combined protein and transcript scRNA-seq is a suitable and powerful method for clinical investigations using PBMCs.