Glycans, the most diverse biopolymer and crucial for many biological processes, are shaped by evolutionary pressures stemming in particular from host-pathogen interactions. While this positions glycans as essential for understanding and targeting host-pathogen interactions, their considerable diversity and a lack of methods have hitherto stymied progress in leveraging their predictive potential. Here, we utilize a curated dataset of 12,674 glycans from 1,726 species to develop and apply machine learning methods to extract evolutionary information from glycans. Our deep learning-based language model SweetOrigins provides evolution-informed glycan representations that we utilize to discover and investigate motifs used for molecular mimicry-mediated immune evasion by commensals and pathogens. Novel glycan alignment methods enable us to identify and contextualize virulence-determining motifs in the capsular polysaccharide of Staphylococcus aureus and Acinetobacter baumannii. Further, we show that glycan-based phylogenetic trees contain most of the information present in traditional 16S rRNA-based phylogenies and improve on the differentiation of genetically closely related but phenotypically divergent species, such as Bacillus cereus and Bacillus anthracis. Leveraging the evolutionary information inherent in glycans with machine learning methodology is poised to provide further, critically needed insights into host-pathogen interactions, sequence-to-function relationships, and the major influence of glycans on phenotypic plasticity.

In contrast to RNA and proteins, whose sequences can be elucidated from their DNA sequence, glycans are the only biological polymer that falls outside the rules of the central dogma of molecular biology. Glycans are present as modifications on all other biopolymers 1, exerting varying effects on biomolecules, including stabilization and modulation of their functionality 2,3.
Apart from influencing the function of individual proteins, glycans are also crucial for cell-cell contact 4 and mediate essential developmental processes 5. Although glycans are synthesized by specific, DNA-encoded enzymes 6, an individual glycan sequence is dependent on an intricate interplay between multiple enzymes and cellular conditions, providing glycans with important roles in phenotypic plasticity. Different glycosylation sites, even on the same protein, can exhibit considerable glycoform heterogeneity, depending on accessibility and glycosyltransferase kinetics, as has been shown for protein disulfide isomerase 7. The glycan alphabet exceeds 1,000 monomers, allowing for an astronomical number of potential oligosaccharides built with different monosaccharides, lengths, connectivity, and branching.