2019
DOI: 10.1093/bioinformatics/btz629

PeNGaRoo, a combined gradient boosting and ensemble learning framework for predicting non-classical secreted proteins

Abstract: Motivation: Gram-positive bacteria have developed secretion systems to transport proteins across their cell wall, a process that plays an important role during host infection. These secretion mechanisms have also been harnessed for therapeutic purposes in many biotechnology applications. Accordingly, the identification of features that select a protein for efficient secretion from these microorganisms has become an important task. Among all the secreted proteins, ‘non-classical’ secreted prote…

Cited by 42 publications (61 citation statements)
References: 46 publications
“…Although our proposed method showed improved performance over other methods, it still has room for improvement. Recently, several novel computational approaches have been proposed in computational biology [68, 78, 80-85] to identify function from the sequence. Hence, developing a novel prediction model by utilizing such approaches may improve the prediction performance.…”
Section: Results (mentioning)
confidence: 99%
“…We used these datasets for the following reasons. First, the proteins of the positive dataset, NCSPs of Gram-positive bacterial proteins, were experimentally verified, and each protein was confirmed by at least three different research groups in at least three different bacterial species [5,15]. Second, the sequence identity was reduced to 80% to avoid potential redundancy.…”
Section: Datasets (mentioning)
confidence: 99%
“…An independent test dataset containing 34 positive samples and 34 negative samples was used for further performance evaluation and comparison. For more details regarding the benchmark datasets, see Zhang et al [15].…”
Section: Datasets (mentioning)
confidence: 99%
“…In this study, we leverage particle swarm optimization [34], which is a metaheuristic algorithm simulating the behavior of birds catching food, to tune the essential parameters of the RF model to further improve the accuracy. The essence of PSO is to use the current position, global extremum and individual extremum information to guide the next iteration position of particles, which enables it to approach the optimal solution with a fast convergence speed, and hence effectively optimize the parameters of the model [16], [35]. Another significant advantage of PSO is that it can adjust the maximum step size at each iteration, making it possible to find an approximate optimal solution in a wide range of possible parameters [36].…”
Section: Parameter Optimization Based on PSO (mentioning)
confidence: 99%
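
The PSO update described in the citation statement above can be sketched as follows. This is a minimal, generic illustration and not the citing authors' implementation: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the velocity clamp `v_max_frac`, and the `toy_error` objective are all assumptions introduced here. In particular, `toy_error` is a hypothetical stand-in for the validation error of an RF model as a function of two parameters; a real study would evaluate the model itself at that point.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, v_max_frac=0.2, seed=0):
    """Minimise `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Maximum step size per dimension (assumed: a fixed fraction of the range).
    v_max = [v_max_frac * (hi - lo) for lo, hi in bounds]

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Individual extremum: best position each particle has visited so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # Global extremum: best position any particle has visited so far.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # New velocity combines the previous velocity, the pull toward
                # the particle's own best, and the pull toward the global best.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max[d], min(v_max[d], v))  # clamp the step size
                vel[i][d] = v
                lo, hi = bounds[d]
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))  # stay in bounds
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(params):
    """Hypothetical error surface with its minimum at (120, 8)."""
    n_trees, depth = params
    return (n_trees - 120.0) ** 2 / 1e4 + (depth - 8.0) ** 2 / 1e2


if __name__ == "__main__":
    best, err = pso_minimize(toy_error, [(10.0, 500.0), (1.0, 30.0)])
    print("best parameters:", best, "error:", err)
```

Integer-valued parameters such as a tree count would need rounding inside the objective; the sketch treats both dimensions as continuous for simplicity.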