Consistent gene annotation in crops is becoming harder as genomes for new cultivars are published at an increasing rate. Gene sets from recently sequenced accessions carry gene identifiers that differ from those of the reference accession, and they might be of higher quality due to technical advances. For these reasons there is a need to define pangenes, which represent all known syntenic orthologues of a gene model and can be linked back to the original sources. A pangene set effectively summarizes our current understanding of the coding potential of a crop and can be used to inform gene model annotation in new cultivars. Here we present an approach (get_pangenes) to identify and analyze pangenes that is not biased towards the reference annotation. The method computes Whole Genome Alignments (WGA), which are then used to estimate overlaps between gene models. In benchmarks on Arabidopsis, rice, wheat and barley datasets, we find that two different WGA algorithms (minimap2 and GSAlign) produce similar pangene sets. Our results show that pangenes recapitulate known phylogeny-based orthologies while adding extra core gene models in rice. More importantly, get_pangenes can also produce clusters of genome segments (gDNA) that overlap with gene models annotated in other cultivars. By lifting over CDS sequences, gDNA clusters can help refine gene models across individuals and confirm or reject observed gene Presence-Absence Variation. Documentation and source code are available at https://github.com/Ensembl/plant-scripts/tree/master/pangenes.
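The overlap step described above can be illustrated with a minimal Python sketch. It is not the get_pangenes implementation. It assumes that gene models from another cultivar have already been projected onto reference coordinates through a WGA, and it pairs them with reference gene models when their shared span covers enough of the shorter interval. The `Interval` type, the function names and the `min_fraction` threshold of 0.5 are all hypothetical choices made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A gene model on one sequence; coordinates are 1-based and inclusive."""
    gene_id: str
    chrom: str
    start: int
    end: int


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if on different sequences)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def overlap_fraction(a: Interval, b: Interval) -> float:
    """Shared bases divided by the length of the shorter interval."""
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return overlap_bp(a, b) / shorter


def match_genes(
    ref_genes: List[Interval],
    projected_genes: List[Interval],
    min_fraction: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[Tuple[str, str]]:
    """Pair reference genes with projected genes that overlap sufficiently."""
    pairs = []
    for r in ref_genes:
        for p in projected_genes:
            if overlap_fraction(r, p) >= min_fraction:
                pairs.append((r.gene_id, p.gene_id))
    return pairs


# Toy data: one reference gene and three genes projected from another cultivar.
ref = [Interval("refA", "chr1", 100, 199)]
proj = [
    Interval("cvX", "chr1", 150, 249),  # shares 50 of 100 bases with refA
    Interval("cvY", "chr1", 190, 400),  # shares only 10 bases with refA
    Interval("cvZ", "chr2", 100, 199),  # different chromosome
]
matches = match_genes(ref, proj)
print(matches)
```

The all-against-all loop keeps the sketch short; with real annotations one would typically sort intervals or use an interval index, and a projected gene left without a partner would be a candidate for Presence-Absence Variation to be checked against the underlying genome segment.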