Polyploidy can have a huge impact on the evolution of species, and it is a common occurrence, especially in plants. The two types of polyploids-autopolyploids and allopolyploids-differ in the level of divergence between the genes that are brought together in the new polyploid lineage. Because allopolyploids are formed via hybridization, the homoeologous copies of genes within them are at least as divergent as orthologs in the parental species that came together to form them. This means that common methods for estimating the parental lineages of allopolyploidy events are not accurate, and can lead to incorrect inferences about the number of gene duplications and losses. Here, we have adapted an algorithm for topology-based gene-tree reconciliation to work with multi-labeled trees (MUL-trees). By definition, MUL-trees have some tips with identical labels, which makes them a natural representation of the genomes of polyploids. Using this new reconciliation algorithm we can: accurately place allopolyploidy events on a phylogeny, identify the parental lineages that hybridized to form allopolyploids, distinguish between allo-, auto-, and (in most cases) no polyploidy, and correctly count the number of duplications and losses in a set of gene trees. We validate our method using gene trees simulated with and without polyploidy, and revisit the history of polyploidy in data from the clades including both baker's yeast and bread wheat. Our re-analysis of the yeast data confirms the allopolyploid origin and parental lineages previously identified for this group. The method presented here should find wide use in the growing number of genomes from species with a history of polyploidy. [Polyploidy; reconciliation; whole-genome duplication.].
Polyploidy can have a huge impact on the evolution of species, and it is a common occurrence, especially in plants. The two types of polyploids - autopolyploids and allopolyploids - differ in the level of divergence between the genes that are brought together in the new polyploid lineage. Because allopolyploids are formed via hybridization, the homoeologous copies of genes within them are at least as divergent as orthologs in the parental species that came together to form them. This means that common methods for estimating the parental lineages of allopolyploidy events are not accurate, and can lead to incorrect inferences about the number of gene duplications and losses. Here, we have adapted an algorithm for topology-based gene-tree reconciliation to work with multi-labeled trees (MUL-trees). By definition, MUL-trees have some tips with identical labels, which makes them a natural representation of the genomes of polyploids. Using this new reconciliation algorithm we can: accurately place allopolyploidy events on a phylogeny, identify the parental lineages that hybridized to form allopolyploids, distinguish between allo-, auto-, and (in most cases) no polyploidy, and correctly count the number of duplications and losses in a set of gene trees. We validate our method using gene trees simulated with and without polyploidy, and revisit the history of polyploidy in data from the clades including both baker’s yeast and bread wheat. Our re-analysis of the yeast data confirms the allopolyploid origin and parental lineages previously identified for this group. The method presented here should find wide use in the growing number of genomes from species with a history of polyploidy.
Quantification of gene expression and characterization of gene transcript structures are central problems in molecular biology. RNA sequencing (RNA-Seq) and chromatin immunoprecipitation sequencing (ChIP-Seq) are important methods, but can be cumbersome and difficult for beginners to learn. To teach interested students and scientists how to analyze RNA-Seq and ChIP-Seq data, we present a start-to-finish tutorial for analyzing RNA-Seq and ChIP-Seq data: SeqAcademy (source code:, https://github.com/NCBI-Hackathons/seqacademy webpage:). This user-friendly pipeline, fully written in http://www.seqacademy.org/ Jupyter Notebook, emphasizes the use of publicly available RNA-Seq and ChIP-Seq data and strings together popular tools that bridge that gap between raw sequencing reads and biological insight. We demonstrate practical and conceptual considerations for various RNA-Seq and ChIP-Seq analysis steps with a biological use casea previously published yeast experiment. This work complements existing sophisticated RNA-Seq and ChIP-Seq pipelines designed for advanced users by gently introducing the critical components of RNA-Seq and ChIP-Seq analysis to the novice bioinformatician. In conclusion, this well-documented pipeline will introduce state-of-the-art RNA-Seq and ChIP-Seq analysis tools to beginning bioinformaticians and help facilitate the analysis of the burgeoning amounts of public RNA-Seq and ChIP-Seq data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.