2012
DOI: 10.1007/s00726-012-1416-6

An empirical study on the matrix-based protein representations and their combination with sequence-based approaches

Abstract: Many domains have a stake in the development of reliable systems for automatic protein classification. Of particular interest in recent studies of automatic protein classification is the exploration of new methods for extracting features from a protein that enhance classification for specific problems. These methods have proven very useful in one or two domains, but they have failed to generalize well across several domains (i.e. classification problems). In this paper, we evaluate several feature extraction a…

Cited by 22 publications (24 citation statements)
References 74 publications
“…To tackle this problem, a wide range of classification techniques have been implemented and used (Shen, 2007a, 2006a; Chou and Shen, 2008; Wan et al, 2013; Chou and Cai, 2002; Yu et al, 2013; Nanni et al, 2013a, 2013b; Shen and Chou, 2007; Huang and Yuan, 2013). Among these classifiers, Support Vector Machine (SVM) (Wan et al, 2013; Yu et al, 2013; Pierleoni et al, 2011; Du and Yu, 2013; Matsuda et al, 2005) or K-Nearest Neighbor (KNN) based classifiers (Shen and Chou, 2010a; Chen et al, 2013a, 2012) have attained the most promising results.…”
Section: Introduction (mentioning)
confidence: 98%
“…The main reason is that previous studies failed to capture local discriminatory information embedded in PSSM properly. They have mainly tried to extract this local information using the protein sequence as a single block which has failed to achieve this goal (Nanni et al, 2013b, 2013a; Dehzangi et al, 2014a).…”
Section: Introduction (mentioning)
confidence: 99%
“…During the last decade, the substitution score (extracted from the PSSM) and predicted secondary structure using PSIBLAST and SPINE-X (or PSIPRED before that) have been widely used in protein science (e.g., protein fold recognition, protein function prediction, protein structure prediction, protein subcellular localization) and extracted features from these sources attained promising results [23], [28], [30], [38], [42], [48]. As it is highlighted in [49], the most sensitive methods for fold recognition use sequence profiles to represent both the query and the data base proteins.…”
Section: Feature Extraction Methods (mentioning)
confidence: 99%
“…Compared to the methods adopted to extract global discriminatory information, a wider range of methods were used to extract local discriminatory information for the PFR [42] such as, pseudo amino acid composition [14], [15], [22], [23], [43], cross covariance [28], auto covariance [28], [42], bi-gram [29], [39], and tri-gram [3]. Despite the significant local discriminatory information provided using these approaches, most of these methods produce large number of features as well as large amount of redundant features [14], [29] which makes them computationally expensive for large protein data banks (e.g., cross covariance and tri-gram [3], [28]).…”
Section: Introduction (mentioning)
confidence: 99%
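
As an illustration of why the local descriptors named in the statement above grow quickly in size, here is a minimal Python sketch of two of them, bi-gram and auto covariance, computed from a PSSM. It assumes the commonly used definitions (bi-gram as the sum over consecutive positions of the product of two column scores; auto covariance as the lag-wise covariance within each column), which may differ in detail from the cited works. The function names `pssm_bigram` and `pssm_auto_covariance`, and the representation of the PSSM as a list of rows, are hypothetical choices made for this example.

```python
from typing import List

# Assumed layout: one row per residue, one column per amino acid (20 for a real PSSM).
Matrix = List[List[float]]


def pssm_bigram(pssm: Matrix) -> List[float]:
    """Bi-gram features: B[m][n] = sum_i P[i][m] * P[i+1][n], flattened row-major."""
    n_cols = len(pssm[0])
    feats = [0.0] * (n_cols * n_cols)
    # Walk over consecutive residue pairs (i, i+1).
    for row, nxt in zip(pssm, pssm[1:]):
        for m in range(n_cols):
            for n in range(n_cols):
                feats[m * n_cols + n] += row[m] * nxt[n]
    return feats


def pssm_auto_covariance(pssm: Matrix, max_lag: int) -> List[float]:
    """Auto covariance per column for lags 1..max_lag (n_cols * max_lag features)."""
    length = len(pssm)
    n_cols = len(pssm[0])
    if not 0 < max_lag < length:
        raise ValueError("max_lag must be between 1 and len(pssm) - 1")
    # Column means over the whole sequence.
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    feats = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            feats.append(total / (length - lag))
    return feats


if __name__ == "__main__":
    # Toy all-zero PSSM with 5 residues and 20 columns, used only to show feature counts.
    toy = [[0.0] * 20 for _ in range(5)]
    print(len(pssm_bigram(toy)))              # number of bi-gram features
    print(len(pssm_auto_covariance(toy, 3)))  # number of auto covariance features
```

With 20 columns, the bi-gram vector has 20 × 20 entries regardless of sequence length, and a tri-gram built the same way would have 20 × 20 × 20, which is consistent with the statement's remark about computational cost on large protein data banks.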
“…Such techniques use sequence and biological information simultaneously. Several works are already proposed for sequence based classification using ensemble [4, 6-8]. In addition, there exist various techniques including Target P, Signal P3.0, WoLF PSORT, TargetLoc, MitoProt II, MITOPRED, MitPred, and Mito-GSAAC, which are all organelle specific methods [4, 9-15].…”
Section: Introduction (mentioning)
confidence: 99%