Background
The removal of introns occurs through the splicing of a 5′ splice site (5′ss) with a 3′ splice site (3′ss). These two elements are recognized by distinct components of the spliceosome. However, introns in higher eukaryotes contain many matches to the 5′ and 3′ splice-site motifs that are presumed not to be used.
Results
Here, we find that many of these sites can be used. We also find occurrences of the AGGT motif that can function as either a 5′ss or a 3′ss—previously referred to as dual-specific splice sites (DSSs)—within introns. Analysis of the Sequence Read Archive reveals a 3.1-fold enrichment of DSSs relative to expectation, implying synergy between the ability to function as a 5′ss and 3′ss. Despite this suggested mechanistic advantage, DSSs are 2.7- and 4.7-fold underrepresented in annotated 5′ and 3′ splice sites. A curious exception is the polyubiquitin gene UBC, which contains a tandem array of DSSs that precisely delimit the boundary of each ubiquitin monomer. The resulting isoforms splice stochastically to include a variable number of ubiquitin monomers. We found no evidence of tissue-specific or feedback regulation but note the 8.4-fold enrichment of DSS-spliced introns in tandem repeat genes suggests a driving role in the evolution of genes like UBC.
Conclusions
We find an excess of unannotated splice sites and the utilization of DSSs in tandem repeats supports the role of splicing in gene evolution. These findings enhance our understanding of the diverse and complex nature of the splicing process.