2018
DOI: 10.1093/bib/bby034

Pattern recognition analysis on long noncoding RNAs: a tool for prediction in plants

Abstract: Our feature selection analysis considered 5468 features, and it used only 16 features to robustly identify lncRNA with the REPTree algorithm. That was the base to create the model and train it with lncRNA and mRNA data from five plant species (thale cress, cucumber, soybean, poplar and Asian rice). After an extensive comparison with other tools largely used in plants (CPC, CPC2, CPAT and PLncPRO), we found that RNAplonc produced more reliable lncRNA predictions from plant transcripts with 87.5% of the best res…

Cited by 56 publications (41 citation statements)
References 32 publications
“…A detailed list of plant lncRNA databases is in Table 3. Several important tools, such as CPPred [158], REPTree [159], Pfamscan [160], COME [161], PLIT [156], and CPC2 [162], are available to distinguish lncRNAs from mRNAs. Advances in bioinformatics tools and new algorithms could further boost our efforts in discovering novel lncRNAs and their accurate functional annotations.…”
Section: Database and Web-based Resources of lncRNAs
mentioning
confidence: 99%
“…Various tools are available to evaluate the coding potential of transcripts and distinguish noncoding RNAs from protein-coding ones using machine learning approaches. These methods have used a variety of models, such as support vector machine (Liu et al, 2006; Kong et al, 2007; Sun et al, 2013a, 2013b; Li et al, 2014; Kang et al, 2017; Vieira et al, 2017), random forest (Hu et al, 2017; Singh et al, 2017; Wucher et al, 2017), logistic regression (Wang et al, 2013), and REPTree (Negri et al, 2018). The features used by these machine learning approaches are usually extracted directly from nucleotide sequences.…”
mentioning
confidence: 99%
“…The Random Forest (RF) classifier [33] was used to build the gene prediction model, obtaining better performance when compared to other state-of-the-art predictors. The RF method was chosen based on our previous studies [14] and it was used in similar cases with good performance [34] [35]. Four models having 100, 200, 500 and 700 decision trees with 5-fold cross-validation with 5 repetitions on the performance evaluation training set were built (Figure 4).…”
Section: Random Forest Parametrization
mentioning
confidence: 99%
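
The evaluation protocol in the last statement (four forest sizes, each scored by 5-fold cross-validation repeated 5 times) can be illustrated with a minimal sketch of the split schedule alone. This is not the citing authors' code: the function names, the shuffling scheme, the seed and the sample count are all assumptions made for illustration, and no classifier is trained here.

```python
import random

def repeated_kfold(n_samples, k=5, repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: the quoted statement does not say how samples
    were shuffled or whether folds were stratified.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # fresh shuffle for each repetition
        folds = [idx[f::k] for f in range(k)]  # k disjoint, near-equal folds
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, test_idx

# Forest sizes named in the quoted statement.
TREE_COUNTS = [100, 200, 500, 700]

def evaluation_schedule(n_samples, tree_counts=TREE_COUNTS):
    """List every (n_trees, repeat, fold) fit the protocol would require."""
    return [(n, r, f)
            for n in tree_counts
            for r, f, _, _ in repeated_kfold(n_samples)]

# n_samples=50 is an arbitrary illustrative value, not taken from the source.
schedule = evaluation_schedule(n_samples=50)
print(len(schedule))  # 4 forest sizes x 5 repeats x 5 folds = 100 fits
```

Under these assumptions each forest size is fitted and scored 25 times, so a comparison of the four models rests on 100 fits in total; averaging the 25 scores per size gives one performance estimate per model.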