From bacterial cell adhesion and viral cell entry to human innate immunity, glycan-binding proteins, or lectins, abound in nature. Widely used as staining and characterization reagents in cell biology and crucial for understanding the interactions in biological systems, lectins are a focal point of study in glycobiology. Yet the sheer breadth and depth of their specificity for diverse oligosaccharide motifs has made the study of lectins largely piecemeal, with few options to generalize. Here, LectinOracle, a model combining transformer-based representations for proteins and graph convolutional neural networks for glycans to predict their interaction, is presented. Using a curated data set of 564,647 unique protein-glycan interactions, it is shown that LectinOracle predictions agree with literature-annotated specificities for a wide range of lectins. Using a range of specialized glycan arrays, it is shown that LectinOracle predictions generalize to new glycans and lectins, with qualitative and quantitative agreement with experimental data. It is further demonstrated that LectinOracle can be used to improve lectin classification, accelerate lectin directed evolution, predict epidemiological outcomes in the context of influenza virus, and analyze whole lectomes in host-microbe interactions. It is envisioned that the platform presented here will advance both the study of lectins and their role in (glyco)biology.
Glycans are essential to all scales of biology, with their intricate structures being crucial for their biological functions. The structural complexity of glycans is communicated through simplified and unified visual representations according to the Symbol Nomenclature for Glycans (SNFG) guidelines adopted by the community. Here, we introduce GlycoDraw, a Python-native implementation for high-throughput generation of high-quality, SNFG-compliant glycan figures with flexible display options. GlycoDraw is released as part of our glycan analysis ecosystem, glycowork, facilitating integration into existing workflows by enabling fully automated annotation of glycan-related figures and thus assisting the analysis of e.g. differential abundance data or glycomics mass spectra.
Breast milk is rich in functionalized milk oligosaccharides (MOs) that nourish and protect the neonate. Yet we lack a comprehensive understanding of the repertoire and evolution of MOs across Mammalia. We report ≈400 MO-species associations (>100 novel structures) from milk glycomics of nine mostly understudied species: alpaca, beluga whale, black rhinoceros, bottlenose dolphin, impala, L'Hoest's monkey, pygmy hippopotamus, domestic sheep, and striped dolphin. This revealed the hitherto unknown existence of the LacdiNAc motif (GalNAcβ1-4GlcNAc) in MOs of all species except alpaca, sheep, and striped dolphin, indicating widespread occurrence of this potentially antimicrobial motif in MOs. We also characterize glucuronic acid-containing MOs (GlcA-MOs) in the milk of impala, dolphins, sheep, and rhinoceros, previously only reported in cows. We demonstrate that these GlcA-MOs exhibit potent immunomodulatory effects. Our study extends the number of known MOs by >15%. Combined with >1,900 curated MO-species associations, we characterize MO motif distributions, presenting an exhaustive overview of MO biodiversity.
Milk oligosaccharides (MOs) are among the most abundant constituents of breast milk and are essential for health and development. Biosynthesized from monosaccharides into complex sequences, MOs differ considerably between taxonomic groups. Even human MO biosynthesis is insufficiently understood, hampering evolutionary and functional analyses. Using a comprehensive resource of all published MOs from more than 100 mammals, we develop a nonparametric pipeline for generating and analyzing MO biosynthetic networks, which readily generalizes to other glycan classes. We then use evolutionary relationships and inferred intermediates of these networks to discover (i) distributional glycome biases, (ii) biosynthetic restrictions, such as reaction path dependence, and (iii) conserved biosynthetic modules. This allows us to prune and pinpoint biosynthetic pathways despite missing information. Machine learning and network analysis cluster species by their milk glycome, identifying characteristic sequence relationships and evolutionary gains/losses of motifs, MOs, and biosynthetic modules. These resources and analyses will advance our understanding of glycan biosynthesis and the evolution of breast milk.