The structure and abundance of splice isoforms can be altered by either non-coding variants or masquerading coding variants. When such variants are common in the population, the resulting non-constitutive transcripts are frequent enough to resemble naturally occurring alternative mRNA splicing. Information theory-based methods have been shown to predict the effects of such variants accurately. Single nucleotide polymorphisms (SNPs) predicted to significantly alter natural and/or cryptic splice site strength were shown to affect gene expression. Splicing changes for known SNP genotypes were confirmed in HapMap lymphoblastoid cell lines with gene expression microarrays and custom-designed q-RT-PCR or TaqMan assays. The majority of these SNPs (15 of 22), as well as an independent set of 24 variants, were then subjected to RNAseq analysis using the ValidSpliceMut web beacon (http://validsplicemut.cytognomix.com), which is based on data from The Cancer Genome Atlas and the International Cancer Genome Consortium. SNPs from different genes analyzed with gene expression microarray and q-RT-PCR exhibited significant changes in affected splice site use. Thirteen SNPs directly affected exon inclusion and 10 altered cryptic site use. Homozygous SNP genotypes resulting in stronger splice sites exhibited higher levels of processed mRNA than alleles associated with weaker sites. Four SNPs exhibited variable expression among individuals with the same genotypes, masking statistically significant expression differences between alleles. Genome-wide information theory and expression analyses (RNAseq) in tumour exomes and genomes confirmed splicing effects for 7 of the HapMap SNPs and 14 SNPs identified from tumour genomes.
q-RT-PCR resolved rare splice isoforms whose read abundance was too low to reach statistical significance in ValidSpliceMut. Nevertheless, the web beacon provides evidence of unanticipated splicing outcomes, for example, intron retention due to compromised recognition of constitutive splice sites. Thus, ValidSpliceMut and q-RT-PCR represent complementary resources for identifying allele-specific alternative splicing.
Keywords: Allele-specific gene expression, mRNA splicing, RT-PCR, gene expression microarray, RNAseq, single nucleotide polymorphism, mutation, cryptic splicing, intron retention, alternative splicing, information theory
Accurate and comprehensive methods are needed for predicting the impact of non-coding mutations, in particular mRNA splicing defects, which are prevalent in genetic disease (Krawczak et al. 1992; Teraoka et al. 1999; Ars et al. 2003; Spielmann and Mundlos, 2016; Gloss and Dinger, 2018). This class of mutations may account for as much as 62% of point mutations (López-Bigas et al. 2005). Large transcriptome studies have suggested that a substantial fraction of genome-wide association study (GWAS) signals for disease and complex traits are due to SNPs affecting mRNA splicing (Park et al. 2018). ValidSpliceMut (Shirley et al. 2019) presents evidence of altere...