Ultra-deep RNA sequencing (RNA-Seq) has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We previously developed multivariate analysis of transcript splicing (MATS), a statistical method for detecting differential alternative splicing between two RNA-Seq samples. Here we describe a new statistical model and computer program, replicate MATS (rMATS), designed for detection of differential alternative splicing from replicate RNA-Seq data. rMATS uses a hierarchical model to simultaneously account for sampling uncertainty in individual replicates and variability among replicates. In addition to the analysis of unpaired replicates, rMATS also includes a model specifically designed for paired replicates between sample groups. The hypothesis-testing framework of rMATS is flexible and can assess the statistical significance over any user-defined magnitude of splicing change. The performance of rMATS is evaluated by the analysis of simulated and real RNA-Seq data. rMATS outperformed two existing methods for replicate RNA-Seq data in all simulation settings, and RT-PCR yielded a high validation rate (94%) in an RNA-Seq dataset of prostate cancer cell lines. Our data also provide guiding principles for designing RNA-Seq studies of alternative splicing. We demonstrate that it is essential to incorporate biological replicates in the study design. Of note, pooling RNAs or merging RNA-Seq data from multiple replicates is not an effective approach to account for variability, and the result is particularly sensitive to outliers. The rMATS source code is freely available at rnaseq-mats.sourceforge.net/. As the popularity of RNA-Seq continues to grow, we expect rMATS will be useful for studies of alternative splicing in diverse RNA-Seq projects.

RNA sequencing | alternative splicing | exon | isoform | transcriptome

Alternative splicing generates tremendous transcriptomic and proteomic complexity in higher eukaryotes (1-4).
Changes in alternative splicing underlie gene regulation in diverse biological and disease processes (5-7). However, it has been challenging to globally determine and compare gene splicing profiles among biological states. The RNA sequencing (RNA-Seq) technology has become a powerful tool for quantitative profiling of alternative splicing (3,4,8). Due to the high cost, earlier RNA-Seq studies of alternative splicing typically did not incorporate replicates in the study design (9-12). Nonetheless, it is important to note that biological variability remains a critical issue in high-throughput sequencing studies (13). Furthermore, as the cost of sequencing continues to decline, it has become feasible and increasingly common to carry out RNA-Seq on a large number of samples, with sufficient coverage to quantify alternative splicing in each individual sample. This creates an urgent need for new and robust analytic tools to detect alternative splicing changes from replicate RNA-Seq data.

Although a variety of computational methods have been developed for RNA-Seq analysis of alternati...