2017
DOI: 10.3390/ncrna3010011

PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants

Abstract: Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they canno…

Cited by 15 publications (12 citation statements)
References 39 publications

“…Various tools are available to evaluate the coding potential of transcripts and distinguish noncoding RNAs from protein-coding ones using machine learning approaches. These methods have used a variety of models, such as support vector machine (Liu et al, 2006; Kong et al, 2007; Sun et al, 2013a, 2013b; Li et al, 2014; Kang et al, 2017; Vieira et al, 2017), random forest (Hu et al, 2017; Singh et al, 2017; Wucher et al, 2017), logistic regression (Wang et al, 2013), and REPTree (Negri et al, 2018). The features used by these machine learning approaches are usually extracted directly from nucleotide sequences.…”
mentioning
confidence: 99%
“…To train the model, positive and negative datasets of 1000 transcripts each were built. From 2432 sugarcane transcripts mapped on intergenic regions, 1689 were classified as lincRNAs by the SVM model and 97 by BLAST [32]. Finally, a total of 67 transcripts were classified as lincRNAs by both BLAST and the SVM model.…”
Section: Results
mentioning
confidence: 99%
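
The counts quoted in the statement above can be broken down with simple inclusion-exclusion. The following is only a sketch: it assumes both the SVM model and BLAST were applied to the same 2432 intergenic transcripts, which the excerpt implies but does not state outright, and the function name `overlap_breakdown` is hypothetical rather than part of any published pipeline.

```python
# Counts quoted in the citation statement (assumed to refer to one shared
# pool of 2432 intergenic transcripts).
TOTAL_INTERGENIC = 2432
SVM_HITS = 1689
BLAST_HITS = 97
BOTH_HITS = 67


def overlap_breakdown(total, hits_a, hits_b, hits_both):
    """Split two overlapping classifications into disjoint groups."""
    only_a = hits_a - hits_both          # called by method A alone
    only_b = hits_b - hits_both          # called by method B alone
    either = only_a + only_b + hits_both # called by at least one method
    neither = total - either             # called by neither method
    return {
        "only_a": only_a,
        "only_b": only_b,
        "both": hits_both,
        "either": either,
        "neither": neither,
    }


breakdown = overlap_breakdown(TOTAL_INTERGENIC, SVM_HITS, BLAST_HITS, BOTH_HITS)
print(breakdown)
```

Under that assumption, 1622 transcripts would be supported by the SVM model alone, 30 by BLAST alone, and 713 by neither method.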
“…Using data from sugarcane transcriptome datasets [30], lincRNAs of sugarcane were identified with a pipeline available in Figure S1, including a specific designed SVM (Support Vector Machine) model [32]. The pipeline was constructed using data from Sorghum bicolor, a plant evolutionarily close to sugarcane for which the full genome sequence is known.…”
Section: Methods
mentioning
confidence: 99%
“…CONC performs slowly on analyzing large ncRNA datasets, while CPC is suitable for known protein-coding transcripts. However, CPC may tend to classify novel PCTs into ncRNAs not recorded in the protein databases [111,118]. The most common methods use pairwise comparisons (CPC [116] and PORTRAIT [117]) or multiple alignments (PhyloCSF [119] and RNAcode [120]).…”
Section: Bioinformatics Approaches
mentioning
confidence: 99%