Motivation Alternative splicing is a ubiquitous process in eukaryotes that allows distinct transcripts to be produced from the same gene. Yet, the study of transcript evolution within a gene family is still in its infancy. One prerequisite for this study is the availability of methods to compare sets of transcripts while accounting for their splicing structure. In this context, we generalize the concept of pairwise spliced alignments to multiple spliced alignments (MSpAs). MSpAs have several important purposes in addition to empowering the study of the evolution of transcripts. For instance, it is a key to improving the prediction of gene models, which is important to solve the growing problem of genome annotation. Despite its essentialness, a formal definition of the concept and methods to compute MSpAs are still lacking. Results We introduce the MSpA problem and the SplicedFamAlignMulti (SFAM) method, to compute the MSpA of a gene family. Like most multiple sequence alignment (MSA) methods that are generally greedy heuristic methods assembling pairwise alignments, SFAM combines all pairwise spliced alignments of coding DNA sequences (CDSs) and gene sequences of a gene family into an MSpA. It produces a single structure that represents the super-structure and models of the gene family. Using real vertebrate and simulated gene family data, we illustrate the utility of SFAM for computing accurate gene family super-structures, MSAs, inferring splicing orthologous groups, and improving gene-model annotations. Availability The supporting data and implementation of SFAM are freely available at https://github.com/UdeS-CoBIUS/SpliceFamAlignMulti.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.