Patients with inflammatory bowel disease (IBD) wait months and undergo numerous invasive procedures between the initial appearance of symptoms and receiving a diagnosis. To reduce time until diagnosis and improve patient wellbeing, machine learning algorithms capable of diagnosing IBD from the gut microbiome’s composition are currently being explored. To date, these models have had limited clinical application because their performance decreases when they are applied to a new cohort of patient samples. Various methods have been developed to analyze microbiome data which may improve the generalizability of machine learning IBD diagnostic tests. With an abundance of methods, there is a need to benchmark the performance and generalizability of various machine learning pipelines (from data processing to training a machine learning model) for microbiome-based IBD diagnostic tools. We collected fifteen 16S rRNA microbiome datasets (7,707 samples) from North America to benchmark combinations of gut microbiome features, data normalization and transformation methods, batch effect correction methods, and machine learning models. Pipeline generalizability to new cohorts of patients was evaluated with two binary classification metrics following leave-one-dataset-out (LODO) cross-validation, where all samples from one study were left out of the training set and used as the test set. We demonstrate that taxonomic features processed with a compositional transformation method and batch effect correction with the naive zero-centering method attain the best classification performance. In addition, machine learning models that identify non-linear decision boundaries between labels are more generalizable than those that are linearly constrained. Lastly, we illustrate the importance of generating a curated training dataset to ensure similar performance across patient demographics. These findings will help improve the generalizability of machine learning models as we move towards non-invasive diagnostic and disease management tools for patients with IBD.
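The two preprocessing steps named in the abstract, a compositional transformation and zero-centering batch correction, can be illustrated with a minimal sketch. The abstract does not say which compositional transformation was used or exactly how zero-centering was defined, so the code below makes two assumptions: it uses the centered log-ratio (CLR) transform with a pseudocount as one common compositional transform, and it reads "zero-centering" as subtracting each study's per-feature mean. The function names, the pseudocount, and the toy counts are all hypothetical.

```python
import math
from collections import defaultdict


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an assumption here) avoids log(0) for absent taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def zero_center_by_batch(samples, batches):
    """Subtract each batch's per-feature mean so every batch is centered at zero."""
    rows_by_batch = defaultdict(list)
    for i, batch in enumerate(batches):
        rows_by_batch[batch].append(i)

    centered = [None] * len(samples)
    for rows in rows_by_batch.values():
        n_features = len(samples[rows[0]])
        means = [
            sum(samples[i][j] for i in rows) / len(rows)
            for j in range(n_features)
        ]
        for i in rows:
            centered[i] = [samples[i][j] - means[j] for j in range(n_features)]
    return centered


# Toy data: four samples, three taxa, two studies (all values made up).
counts = [[10, 0, 90], [20, 5, 75], [300, 10, 690], [250, 50, 700]]
batches = ["study_A", "study_A", "study_B", "study_B"]

clr_rows = [clr(row) for row in counts]
transformed = zero_center_by_batch(clr_rows, batches)
```

After both steps, every CLR-transformed sample sums to zero across taxa, and every taxon has mean zero within each study, which removes study-level shifts in feature location but not differences in variance.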
Background: Inflammatory bowel disease (IBD) patients wait months and undergo numerous invasive procedures between the initial appearance of symptoms and receiving a diagnosis. In order to reduce time until diagnosis and improve patient wellbeing, machine learning algorithms capable of diagnosing IBD from the gut microbiome’s composition are currently being explored. To date, these models have had limited clinical application due to decreased performance when applied to a new cohort of patient samples. Various methods have been developed to analyze microbiome data which may improve the generalizability of machine learning IBD diagnostic tests. With an abundance of methods, there is a need to benchmark the performance and generalizability of various machine learning pipelines (from data processing to training a machine learning model) for microbiome-based IBD diagnostic tools.
Results: We collected fifteen 16S rRNA microbiome datasets (7707 samples) from North America to benchmark combinations of gut microbiome features, data normalization methods, batch effect reduction methods, and machine learning models. Pipeline generalizability to new cohorts of patients was evaluated with four binary classification metrics following leave-one-dataset-out cross-validation, where all samples from one study were left out of the training set and tested upon. We demonstrate that taxonomic features obtained from QIIME2 lead to better classification of samples from IBD patients than inferred functional features obtained from PICRUSt2. In addition, machine learning models that identify non-linear decision boundaries between labels are more generalizable than those that are linearly constrained. Prior to training a non-linear machine learning model on taxonomic features, it is important to apply a compositional normalization method and remove batch effects with the naive zero-centering method. Lastly, we illustrate the importance of generating a curated training dataset to ensure similar performance across patient demographics.
Conclusions: These findings will help improve the generalizability of machine learning models as we move towards non-invasive diagnostic and disease management tools for patients with IBD.
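The leave-one-dataset-out scheme described in the Results can be sketched as follows. This is only an illustration of how the folds are formed, not the authors' pipeline: the helper name `lodo_splits`, the toy study identifiers and labels, and the majority-label stand-in for a trained model are all hypothetical.

```python
def lodo_splits(dataset_ids):
    """Yield (held_out, train_idx, test_idx), holding out one whole dataset per fold."""
    for held_out in sorted(set(dataset_ids)):
        train_idx = [i for i, d in enumerate(dataset_ids) if d != held_out]
        test_idx = [i for i, d in enumerate(dataset_ids) if d == held_out]
        yield held_out, train_idx, test_idx


# Toy data: five samples from three studies, with binary labels (1 = IBD, 0 = control).
dataset_ids = ["study_A", "study_A", "study_B", "study_B", "study_C"]
labels = [1, 0, 1, 1, 0]

folds = list(lodo_splits(dataset_ids))
for held_out, train_idx, test_idx in folds:
    # Stand-in for a real classifier: predict the majority label of the training studies.
    train_labels = [labels[i] for i in train_idx]
    majority = int(sum(train_labels) * 2 >= len(train_labels))
    accuracy = sum(labels[i] == majority for i in test_idx) / len(test_idx)
    print(held_out, "train:", train_idx, "test:", test_idx, "accuracy:", accuracy)
```

Because every sample from the held-out study is excluded from training, each fold's score reflects performance on a cohort the model has never seen, which is the generalization question the benchmark asks.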
The composition and metabolism of the human gut microbiota are strongly influenced by dietary complex glycans, which cause downstream effects on the physiology and health of hosts. Despite recent advances in our understanding of glycan metabolism by human gut bacteria, we still need methods to link glycans to their consuming bacteria. Here, we use a functional assay to identify and isolate gut bacteria from healthy human volunteers that take up different glycans. The method combines metabolic labeling using fluorescent oligosaccharides with fluorescence-activated cell sorting (FACS), followed by amplicon sequencing or culturomics. Our results demonstrate metabolic labeling in various taxa, such as Prevotella copri, Collinsella aerofaciens and Blautia wexlerae. In vitro validation confirms the ability of most, but not all, labeled species to consume the glycan of interest for growth. In parallel, we show that glycan consumers spanning three major phyla can be isolated from cultures of sorted labeled cells. By linking bacteria to the glycans they consume, this approach increases our basic understanding of glycan metabolism by gut bacteria. Going forward, it could be used to provide insight into the mechanism of prebiotic approaches, where glycans are used to manipulate the gut microbiota composition.
Inflammatory bowel diseases (IBD), subdivided into Crohn’s disease (CD) and ulcerative colitis (UC), are chronic diseases that are characterized by relapsing and remitting periods of inflammation in the gastrointestinal tract. In recent years, the amount of research surrounding digital health (DH) and artificial intelligence (AI) has increased. The purpose of this scoping review is to explore this growing field of research to summarize the role of DH and AI in the diagnosis, treatment, monitoring and prognosis of IBD. A review of 21 articles revealed the impact of both AI algorithms and DH technologies; AI algorithms can improve diagnostic accuracy, assess disease activity, and predict treatment response based on data modalities such as endoscopic imaging and genetic data. In terms of DH, patients utilizing DH platforms experienced improvements in quality of life, disease literacy, treatment adherence, and medication management. In addition, DH methods can reduce the need for in-person appointments, decreasing the use of healthcare resources without compromising the standard of care. These articles demonstrate preliminary evidence of the potential of DH and AI for improving the management of IBD. However, the majority of these studies were performed in a regulated clinical environment. Therefore, further validation of these results in a real-world environment is required to assess the efficacy of these methods in the general IBD population.
Diet-derived polysaccharides are an important carbon source for gut bacteria and shape the human gut microbiome. Acarbose, a compound used clinically to treat type 2 diabetes, is known to inhibit the growth of some bacteria on starches based on its activity as an inhibitor of α-glucosidases and α-amylases. In contrast to acarbose, montbretin A, a new drug candidate for the treatment of type 2 diabetes, has been reported to be more specific for the inhibition of α-amylase, notably human pancreatic α-amylase. However, the effects of both molecules on glycan metabolism across a larger diversity of human gut bacteria remain to be characterized. Here, we used ex vivo metabolic labeling of a human microbiota sample with fluorescent maltodextrin to identify gut bacteria affected by amylase inhibitors. Metabolic labeling was performed in the presence and absence of amylase inhibitors, and the fluorescently labeled bacteria were identified by fluorescence-activated cell sorting coupled with 16S rDNA amplicon sequencing. We validated the labeling results in cultured isolates and identified four gut bacterial species whose metabolism of maltodextrin is inhibited by acarbose. In contrast, montbretin A slowed the growth of only one species, supporting the view that it is more selective. Metabolic labeling is a valuable tool to characterize glycan metabolism in microbiota samples and could help understand the off-target impact of drugs on the human gut microbiota.