2012
DOI: 10.1371/journal.pone.0031057

An Ensemble Classifier for Eukaryotic Protein Subcellular Location Prediction Using Gene Ontology Categories and Amino Acid Hydrophobicity

Abstract: With the rapid increase of protein sequences in the post-genomic age, it is challenging to develop accurate, automated methods that predict their subcellular localizations reliably and quickly. Many approaches have been tried so far, but most use only a single algorithm. In this paper, we propose an ensemble classifier that combines KNN (k-nearest neighbor) and SVM (support vector machine) algorithms through a voting system to predict the subcellular localization of eukaryotic proteins. The overall predi…
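The voting idea named in the abstract can be illustrated with a minimal sketch. The abstract is truncated here and does not give the paper's actual voting rule, base-classifier settings, or features, so everything below is an assumption for illustration: the function names `majority_vote` and `ensemble_predict`, the tie-breaking rule (the earliest-listed classifier wins), and the hard-coded predictions, which stand in for the outputs of already-trained KNN and SVM models.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; on a tie, the label voted earliest wins.

    The tie-breaking rule is an assumption, not taken from the paper.
    """
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def ensemble_predict(per_classifier_predictions):
    """Combine equal-length prediction lists, casting one vote per classifier."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical outputs of three base classifiers for four proteins.
knn_k1 = ["nucleus", "cytoplasm", "mitochondrion", "nucleus"]
knn_k5 = ["nucleus", "nucleus", "mitochondrion", "cytoplasm"]
svm_rbf = ["cytoplasm", "nucleus", "mitochondrion", "membrane"]

combined = ensemble_predict([knn_k1, knn_k5, svm_rbf])
print(combined)
```

In this sketch the fourth protein receives three different labels, so the result falls back to the first classifier's vote; a real ensemble might instead weight votes by each classifier's validation accuracy.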

Cited by 61 publications (35 citation statements)
References 61 publications
“…Being one type of general PseAAC [32], the GO (Gene Ontology) has been widely used to improve the prediction quality of protein subcellular localization (see, e.g., [23,25,26,87–91]). The advantage of using the GO approach is that proteins mapped into the GO space (instead of Euclidean space or any other simple geometric space) would be better clustered according to their subcellular locations, as elaborated in [9,92].…”
Section: Proteins Sample Formulation (mentioning)
confidence: 99%
“…Not only protein sequence information but also prediction algorithms could affect the accuracy of subcellular localization prediction (Li et al, 2012). To date, many computational techniques, such as the neural network (Zou et al, 2007), K-nearest neighbor (KNN) (Chou et al, 2006; Xiao et al, 2011b; He et al, 2012), fuzzy KNN (Gu et al, 2010), and Bayesian (Briesemeister et al, 2010a; Simha et al, 2014; Simha et al, 2015), have been introduced for the prediction of protein subcellular localization.…”
Section: Introduction (mentioning)
confidence: 99%
“…In recent times, a support vector machine (SVM) (Höglund et al, 2006; Li et al, 2012; Wan et al, 2012; Wan et al, 2014; Hasan et al, 2015) has been extensively applied to provide potential solutions for the prediction of protein subcellular localization. However, the selection of an appropriate kernel and its parameters for a given classification problem influences the performance of the SVM.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, several state-of-the-art multi-label predictors have been proposed, such as Hum-mPLoc 2.0 [51], iLoc-Hum [27], mGOASVM [61], HybridGO-Loc [62], R3P-Loc [63], mPLRLoc [64] and other predictors [65,66,67]. They all use the GO information as the features and apply different multi-label classifiers to tackle the multi-label classification problem.…”
Section: Introduction (mentioning)
confidence: 99%