Motivation: Genome-wide analysis of alternative splicing has been a very active field of research since the early days of Next Generation Sequencing technologies. Since then, ever-growing data availability and the development of increasingly sophisticated analysis methods have uncovered the complexity of the general splicing repertoire. A large number of splicing analysis methodologies exist, each with its own strengths and weaknesses. For instance, methods relying exclusively on junction information do not take advantage of the large majority of reads produced in an RNA-seq assay, isoform reconstruction methods might not detect novel intron retention events, some solutions can only handle canonical splicing events, and many existing methods can only perform pairwise comparisons.

Results: In this contribution, we present ASpli, a computational suite implemented in the R statistical language that allows the identification of changes in both annotated and novel alternative splicing events and can deal with simple, multi-factor or paired experimental designs. Our integrative computational workflow applies the same GLM model to different sets of reads and junctions in order to compute complementary splicing signals. Analyzing simulated and real data, we found that the consolidation of these signals resulted in a robust proxy of the occurrence of splicing alterations. While the analysis of junctions allowed us to uncover annotated as well as non-annotated events, read coverage signals notably increased recall capabilities at a very competitive performance when compared against other state-of-the-art splicing analysis algorithms. ASpli is freely available from the Bioconductor project site: https://www.bioconductor.org/packages/ASpli

Supplementary information: Supplementary data are available at Bioinformatics online.
The vast majority of protein-coding genes in eukaryotic organisms are transcribed into precursor messenger RNA molecules (pre-mRNA) carrying protein-coding regions (exons) interleaved with non-coding ones (introns). The latter are removed in a co-transcriptional, dynamic maturation process called splicing. Alternative splicing (AS) occurs whenever distinct splice sites are selected in this process, resulting in different mature mRNA molecules [1, 2]. Far from being an exception, AS was found to be a rather common mechanism of gene regulation that expands the functional diversity of a single gene by allowing the generation of multiple mRNA isoforms from a single genomic locus [3]. Five basic modes of AS are generally recognized: the skipping of a given exon (exon skipping, ES); the exon elongation or contraction produced by the use of alternative 5' donor (Alt5') or 3' acceptor (Alt3') sites, respectively; the retention of an intronic stretch in the mature mRNA form (intron retention, IR); and the alternative use of mutually exclusive exons (MEX). These canonical forms of AS are prevalent among eukaryotes, although their relative incidence might vary between them [4]. Despite their ubiquity, these simple patterns, which mainly involve binary choices of exons and of donor and acceptor sites, do not exhaust the splicing repertoire. On the contrary, much more complex, biologically relevant patterns can arise [5, 6].
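As an illustration of the five basic modes listed above, the following minimal Python sketch derives the splice junctions (introns) implied by toy isoforms and reports which junctions distinguish the two isoforms of each pair. It is not part of ASpli: the function `junctions`, the `TOY_EVENTS` table and all coordinates are hypothetical, and a forward-strand gene with 1-based inclusive coordinates is assumed.

```python
from typing import Dict, List, Set, Tuple

Exon = Tuple[int, int]      # (start, end), 1-based inclusive, forward strand assumed
Junction = Tuple[int, int]  # (first intronic base, last intronic base)


def junctions(exons: List[Exon]) -> Set[Junction]:
    """Return the introns implied by consecutive exons of a single isoform."""
    ordered = sorted(exons)
    return {
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
    }


# Hypothetical isoform pairs (A, B); every coordinate is made up for illustration.
TOY_EVENTS: Dict[str, Tuple[List[Exon], List[Exon]]] = {
    # B skips the middle exon of A.
    "ES": ([(1, 100), (201, 300), (401, 500)], [(1, 100), (401, 500)]),
    # B uses a donor site 20 bases downstream, elongating the first exon.
    "Alt5'": ([(1, 100), (201, 300)], [(1, 120), (201, 300)]),
    # B uses an acceptor site 20 bases upstream, elongating the second exon.
    "Alt3'": ([(1, 100), (201, 300)], [(1, 100), (181, 300)]),
    # B retains the intron, so both exons merge into one.
    "IR": ([(1, 100), (201, 300)], [(1, 300)]),
    # A and B each include a different one of two middle exons.
    "MEX": ([(1, 100), (201, 300), (601, 700)], [(1, 100), (401, 500), (601, 700)]),
}

for mode, (iso_a, iso_b) in TOY_EVENTS.items():
    j_a, j_b = junctions(iso_a), junctions(iso_b)
    print(mode, "only in A:", sorted(j_a - j_b), "only in B:", sorted(j_b - j_a))
```

In this toy setting the intron-retaining isoform contributes no junction of its own, so evidence for it would have to come from reads covering the intron body rather than from junction reads.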
In practice, the study of AS faces many technical challenges, so every quantitative approach typically suffers from methodological limitations. Despite the use of different statistical approaches, some methods consider only pre-existing annotation, some can exclusively handle canonical splicing events, and some can only handle pairwise comparisons between conditions (for a comprehensive review see [7, 8, 9]).

The analysis of AS at genomic scale started with microarray technologies [10, 11] and nowadays is routinely probed using RNA-seq assays [12, 13]. Roughly speaking, there are three main computational approaches to study splicing diversity from RNA-seq data. On the one hand, there are transcript reconstruction methods, like MISO [14] or Cufflinks [15], that aim to infer a probabilistic model of the frequency of each isoform from the read distribution mapped to a given gene. In the same spirit, Kallisto [16] and Salmon [17] are two recently introduced methods that leverage light-weight pseudo-alignment heuri...