RNA sequences are the primary material accessible to the nuclear splicing machinery; understanding how they are transformed into a binary decision of intron removal and exon ligation is therefore critical to resolving the mystery of pre-mRNA splicing. This paper proposes an exon/intron discrimination framework (EIDF) to profile the intrinsic differences between exons and their immediately adjacent introns based on information from a single sequence. The EIDF focuses on the frequencies of specific mono-, di-, and tri-nucleotides in the individual sequence, and a simple exon/intron classifier is implemented accordingly. The experimental results show that the proposed EIDF is a valuable profile of splice site sequences and reveal the possibility of simulating the splicing machinery in silico.
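The abstract does not specify which mono-, di-, and tri-nucleotides the EIDF uses or how its classifier is built, so the following is only a minimal sketch of the general idea of profiling a single sequence by its k-mer frequencies. The function names `kmer_frequencies` and `sequence_profile`, the overlapping-window counting, and the conversion of T to U are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; DNA input is converted below


def kmer_frequencies(seq, k):
    """Relative frequency of every possible k-mer over ALPHABET in one sequence.

    Overlapping windows are counted. Windows containing symbols outside
    ALPHABET (for example N) count toward the total but are not reported.
    """
    seq = seq.upper().replace("T", "U")
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {kmer: counts.get(kmer, 0) / total for kmer in kmers}


def sequence_profile(seq):
    """Concatenate mono-, di-, and tri-nucleotide frequencies (4 + 16 + 64 features)."""
    features = {}
    for k in (1, 2, 3):
        features.update(kmer_frequencies(seq, k))
    return features


if __name__ == "__main__":
    # Hypothetical toy sequence, not data from the paper.
    profile = sequence_profile("GUAAGUACGUUUCAG")
    print(len(profile), profile["GU"])
```

A profile of this kind could be fed to any standard classifier; which features are discriminative between exons and introns is an empirical question that the sketch does not address.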
Splice sites are essential for pre-mRNA maturation and crucial for Splice Site Modelling (SSM); however, there are gaps between the splicing signals and the computationally identified sequence features. In this paper, Locality Sensitive Features (LSFs) are proposed to reduce these gaps by homogenising their contexts. Using skewness-kurtosis based statistics and data analysis, SSM with LSFs is carried out by double-boundary outlier filters. The LSF-based SSM was applied to six model organisms of diverse species; in the accuracy and Receiver Operating Characteristic (ROC) analysis, the promising results show that the proposed methodology is versatile and robust for splice-site classification. The LSF-based SSM may serve as a new infrastructure for developing effective splice-site prediction methods and has the potential to be applied to other sequence prediction problems.