2018
DOI: 10.1186/s12859-018-2535-1

Sequence-based bacterial small RNAs prediction using ensemble learning strategies

Abstract: Background: Bacterial small non-coding RNAs (sRNAs) have emerged as important elements in diverse physiological processes, including growth, development, cell proliferation, differentiation, metabolic reactions and carbon metabolism, and attract great attention. Accurate prediction of sRNAs is important and challenging, and helps to explore functions and mechanism of sRNAs. Results: In this paper, we utilize a variety of sRNA sequence-derived features to develop ensemble learning methods for the sRNA prediction. Fi…

Cited by 15 publications (21 citation statements)
References 60 publications
“…In this paper, instead of using PSSM and its derived features, the focus was on constructing an efficient sub-Golgi protein RF classifier, namely rfGPT, based only on amino acid and dipeptide composition-based feature vectors. Related studies (Li et al, 2016; Luo et al, 2016; Tang et al, 2018; Zhang et al, 2018a,b) have demonstrated the effectiveness of composition and dipeptide and amino acid composition-based features for solving bioinformatics problems. The rfGPT with 55-dimensional features of 2-gap dipeptide composition attained better jackknife cross-validation scores (ACC = 91.1%; MCC = 0.823; Sn = 87.4%; Sp = 94.7%) and better independent testing results (ACC = 89.1%; MCC = 0.631; Sn = 53.8%; Sp = 98.0%) than those classifiers trained on the same data set (Ding et al, 2013; Jiao and Du, 2016a,b).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Although there are several experimental methods based on the high-throughput techniques that have been developed to recognize the Ψ modifications, they are both costly and time consuming [13–17]. In addition, researchers are facing an explosive increase of RNA data in the post-genomic age [18–30]. Therefore, intelligent computational approaches are highly desirable to predict Ψ sites on RNA sequences.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Feature fusion has been successfully applied into bio-sequence analysis (Zhang et al, 2017; Tang et al, 2018; Wei et al, 2018a,b; Liu et al, 2019d) and other bioinformatics tasks (Liang et al, 2018; Zhang et al, 2018, 2019a; Gong et al, 2019; Wang et al, 2019). It refers to merge different types of feature representations to more comprehensively capture the characteristics of samples from different perspectives.…”
Section: Feature Fusion and Optimization Protocol
Citation type: mentioning
confidence: 99%