Saccharomyces hybrid yeasts are receiving increasing attention as a powerful model system for understanding adaptation to environmental stress and speciation mechanisms, using experimental evolution and omics techniques. We compiled all genomic resources available from public repositories for the eight recognized Saccharomyces species and their interspecific hybrids. We present current counts of sequenced genomes, assemblies, annotations, and sequencing runs, and an updated species phylogeny based on orthogroup inference. While genomic resources are highly skewed towards Saccharomyces cerevisiae, recent years have seen a noticeable shift towards the use of wild, recently discovered yeast species. To illustrate the degree and potential causes of reproductive isolation, we reanalyzed published data on hybrid spore viabilities across the entire genus and tested for the role of genetic, geographic, and ecological divergence within and between species (28 cross types and 371 independent crosses). Hybrid viability generally decreased with parental genetic distance, likely due to antirecombination and negative epistasis, but notable exceptions emphasize the importance of strain-specific structural variation and ploidy differences. Surprisingly, the viability of crosses within species varied widely, from near reproductive isolation to near-perfect viability. Geographic and ecological origins of the parents predicted cross viability to an extent, but with certain caveats. Finally, we highlight publication trends in the field and point out areas of special interest where hybrid yeasts are particularly promising for innovation in research and development, experimental evolution, and fermentation.
Take Away: This article provides (1) a quantitative review of all genomic resources of Saccharomyces yeast species and their hybrids, (2) a compilation of published data on reproductive isolation (spore viabilities) within and between species, (3) highlights and trends in the publication efforts within this field, and (4) areas where yeast hybrids are and will be particularly useful for research and development in the future.
Motivation: DNA barcodes are short, random nucleotide sequences introduced into cell populations to track the relative counts of hundreds of thousands of individual lineages over time. Lineage tracking is widely applied, e.g., to understand evolutionary dynamics in microbial populations and the progression of breast cancer in humans. Barcode sequences are unknown upon insertion and must be identified using next-generation sequencing technology, which is error prone. In this study, we frame the barcode error correction task as a clustering problem with the aim of identifying true barcode sequences from noisy sequencing data. We present Shepherd, a novel clustering method that is based on an indexing system of barcode sequences using k-mers, and a Bayesian statistical test incorporating a substitution error rate to distinguish true from error sequences.

Results: When benchmarking with synthetic data, Shepherd provides barcode count estimates that are significantly more accurate than state-of-the-art methods, producing 10 to 150 times fewer spurious lineages. For empirical data, Shepherd produces results that are consistent with the improvements seen on synthetic data. These improvements enable higher resolution lineage tracking and more accurate estimates of biologically relevant quantities, e.g., the detection of small effect mutations.

Availability: A Python implementation of Shepherd is freely available at: https://www.github.com/Nik-Tavakolian/Shepherd.

Supplementary information: Supplementary data are available at Bioinformatics online.
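The general idea of clustering noisy barcodes with a k-mer index and a substitution error rate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Shepherd's implementation: the function names, the parameters `k`, `eps`, `max_dist`, and `slack`, and the simple expected-count threshold (used here in place of the Bayesian test mentioned in the abstract) are all hypothetical. It also assumes equal-length barcodes and substitution errors only.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def expected_error_count(parent_count, dist, length, eps):
    """Expected reads of one specific sequence that differs from its parent
    by `dist` substitutions, given a per-base substitution rate `eps`
    (assumption: each of the 3 alternative bases is equally likely)."""
    return parent_count * (eps / 3) ** dist * (1 - eps) ** (length - dist)


def cluster_barcodes(reads, k=5, eps=0.01, max_dist=2, slack=3.0):
    """Greedy clustering: visit sequences from most to least abundant and
    merge each into a nearby, more abundant center when its count is small
    enough to be explained by sequencing errors; otherwise open a new cluster.
    """
    counts = Counter(reads)
    index = defaultdict(set)  # (position, k-mer) -> cluster centers
    clusters = {}             # cluster center -> total read count

    for seq, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        keys = [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]

        # Candidate centers share at least one k-mer at the same position.
        candidates = set()
        for key in keys:
            candidates |= index.get(key, set())

        best = None
        for center in candidates:
            if len(center) != len(seq):
                continue
            d = hamming(center, seq)
            if d > max_dist:
                continue
            lam = expected_error_count(counts[center], d, len(seq), eps)
            # Simple threshold (illustrative stand-in for a statistical test).
            if c <= max(1.0, slack * lam):
                if best is None or (d, center) < best:
                    best = (d, center)

        if best is not None:
            clusters[best[1]] += c
        else:
            clusters[seq] = c
            for key in keys:
                index[key].add(seq)
    return clusters


if __name__ == "__main__":
    reads = (["ACGTACGTACGTACGTACGT"] * 1000
             + ["ACGTACGTACTTACGTACGT"] * 2     # likely a sequencing error
             + ["TTTTGGGGCCCCAAAATTTT"] * 500)
    for center, total in cluster_barcodes(reads).items():
        print(center, total)
```

The index restricts comparisons to sequences that share a k-mer at the same position, which avoids comparing every pair of sequences. The threshold keeps an abundant sequence as its own lineage even when it lies one substitution away from a larger one.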