BackgroundMicroRNAs (miRNAs) are small, single stranded RNAs with a key role in post-transcriptional regulation of thousands of genes across numerous species. While several computational methods are currently available for identifying miRNA genes, accurate prediction of the mature miRNA remains a challenge. Existing approaches fall short in predicting the location of mature miRNAs but also in finding the functional strand(s) of miRNA precursors.Methodology/Principal FindingsHere, we present a computational tool that incorporates a Naive Bayes classifier to identify mature miRNA candidates based on sequence and secondary structure information of their miRNA precursors. We take into account both positive (true mature miRNAs) and negative (same-size non-mature miRNA sequences) examples to optimize sensitivity as well as specificity. Our method can accurately predict the start position of experimentally verified mature miRNAs for both human and mouse, achieving a significantly larger (often double) performance accuracy compared with two existing methods. Moreover, the method exhibits a very high generalization performance on miRNAs from two other organisms. More importantly, our method provides direct evidence about the features of miRNA precursors which may determine the location of the mature miRNA. We find that the triplet of positions 7, 8 and 9 from the mature miRNA end towards the closest hairpin have the largest discriminatory power, are relatively conserved in terms of sequence composition (mostly contain a Uracil) and are located within or in very close proximity to the hairpin loop, suggesting the existence of a possible recognition site for Dicer and associated proteins.ConclusionsThis work describes a novel algorithm for identifying the start position of mature miRNA(s) produced by miRNA precursors. Our tool has significantly better (often double) performance than two existing approaches and provides new insights about the potential use of specific sequence/structural information as recognition signals for Dicer processing. Web Tool available at: http://mirna.imbb.forth.gr/MatureBayes.html
The majority of existing computational tools rely on sequence homology and/or structural similarity to identify novel microRNA (miRNA) genes. Recently supervised algorithms are utilized to address this problem, taking into account sequence, structure and comparative genomics information. In most of these studies miRNA gene predictions are rarely supported by experimental evidence and prediction accuracy remains uncertain. In this work we present a new computational tool (SSCprofiler) utilizing a probabilistic method based on Profile Hidden Markov Models to predict novel miRNA precursors. Via the simultaneous integration of biological features such as sequence, structure and conservation, SSCprofiler achieves a performance accuracy of 88.95% sensitivity and 84.16% specificity on a large set of human miRNA genes. The trained classifier is used to identify novel miRNA gene candidates located within cancer-associated genomic regions and rank the resulting predictions using expression information from a full genome tiling array. Finally, four of the top scoring predictions are verified experimentally using northern blot analysis. Our work combines both analytical and experimental techniques to show that SSCprofiler is a highly accurate tool which can be used to identify novel miRNA gene candidates in the human genome. SSCprofiler is freely available as a web service at http://www.imbb.forth.gr/SSCprofiler.html.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.