Immediately after birth, newborn babies experience rapid colonisation by microorganisms from their mothers and the surrounding environment [1]. Diseases in childhood and later in life are potentially mediated through perturbation of infant gut microbiota colonisation [2]. However, the impact of modern clinical practices, such as caesarean section delivery and antibiotic usage, on the earliest stages of gut microbiota acquisition and development during the neonatal period (≤1 month) remains controversial [3,4]. Here we report disrupted maternal transmission of Bacteroides strains and high-level colonisation by healthcare-associated opportunistic pathogens, including Enterococcus, Enterobacter and Klebsiella species, in babies delivered by caesarean section (C-section) and, to a lesser extent, in those delivered vaginally with maternal antibiotic prophylaxis or not breastfed during the neonatal period. Applying longitudinal sampling and whole-genome shotgun metagenomic analysis to 1,679 gut microbiotas of 772 full-term, UK hospital-born babies and mothers, we demonstrate that the mode of delivery is a significant factor affecting gut microbiota composition during the neonatal period, with an effect that persists into infancy (1 month to 1 year). Matched large-scale culturing and whole-genome sequencing (WGS) of over 800 bacterial strains cultured from these babies identified virulence factors and clinically relevant antimicrobial resistance (AMR) in opportunistic pathogens that may predispose to opportunistic infections. Our findings highlight the critical early roles of the local environment (i.e. mother and hospital) in establishing the gut microbiota in very early life, and identify colonisation with AMR-carrying, healthcare-associated opportunistic pathogens as a previously unappreciated risk factor.
Motivation: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions.
Results: We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genomes to tune its parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning the method involves training machine learning models on about 10^8 samples in 10^7 dimensions, which is out of reach of standard software but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and are more sensitive to sequencing errors. We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineages absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2–17 times with respect to the BWA-MEM short read mapper, depending on the number of candidate species and the level of sequencing noise.
Availability and implementation: Data and code are available at http://cbio.ensmp.fr/largescalemetagenomics.
Contact: pierre.mahe@biomerieux.com
Supplementary information: Supplementary data are available at Bioinformatics online.
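To make the idea of compositional assignment concrete, the following is a minimal Python sketch, not the authors' implementation. It represents each sequence by its normalised k-mer counts, samples overlapping fragments from reference sequences, and assigns a read to the reference whose average profile it matches best. The nearest-centroid scorer is a toy stand-in for the large-scale machine learning models mentioned in the abstract, and the function names, the toy sequences, the fragment length and the value k = 4 are all assumptions chosen for illustration (the abstract reports k of about 12 on real genomes).

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k):
    """Return the L2-normalised k-mer count vector of seq as a sparse dict."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip k-mers with ambiguous bases
    )
    norm = sqrt(sum(c * c for c in counts.values()))
    return {kmer: c / norm for kmer, c in counts.items()} if norm else {}


def fragments(genome, length, step):
    """Cut a reference into overlapping fragments (hypothetical sampling scheme)."""
    return [genome[i:i + length] for i in range(0, len(genome) - length + 1, step)]


def train_centroids(references, k, length, step):
    """Average the k-mer profiles of the fragments drawn from each reference."""
    centroids = {}
    for label, genome in references.items():
        total = Counter()
        frags = fragments(genome, length, step)
        for frag in frags:
            total.update(kmer_profile(frag, k))  # adds the float weights per k-mer
        centroids[label] = {kmer: v / len(frags) for kmer, v in total.items()}
    return centroids


def classify(read, centroids, k):
    """Assign the read to the label whose centroid has the largest dot product."""
    profile = kmer_profile(read, k)

    def score(label):
        centroid = centroids[label]
        return sum(v * centroid.get(kmer, 0.0) for kmer, v in profile.items())

    return max(centroids, key=score)


# Toy references (made-up sequences, not real genomes).
references = {
    "taxon_A": "AACCGGTT" * 25,
    "taxon_B": "ATATGCGCAT" * 20,
}
centroids = train_centroids(references, k=4, length=50, step=10)

print(classify("CCGGTTAACCGGTTAACC", centroids, k=4))
print(classify("GCGCATATATGCGCAT", centroids, k=4))
```

With k = 12 the feature space has 4^12 (about 1.7 × 10^7) possible k-mers, which is consistent with the roughly 10^7 dimensions quoted above and explains why sparse representations and large-scale learners are needed in practice.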