2007
DOI: 10.1007/s00726-007-0568-2

Predicting DNA-binding proteins: approached from Chou’s pseudo amino acid composition and other specific sequence features

Abstract: DNA-binding proteins play a pivotal role in gene regulation. It is vitally important to develop an automated and efficient method for timely identification of novel DNA-binding proteins. In this study, we proposed a method based on the primary sequences of proteins alone to predict DNA-binding proteins. DNA-binding proteins were encoded by autocross-covariance transform, pseudo-amino acid composition, and dipeptide composition, respectively, and also by different combinations of the three encoding methods; furt…

Cited by 160 publications (69 citation statements)
References 67 publications
“…Various methods have been developed for extracting features from proteins, including pseudo amino acid composition (Chen and Li 2007; Chou 2001; Ding and Zhang 2008) and Markov chains model (Bulashevska and Eils 2006). The Chou's pseudo amino acid composition (PseAAC) is one of the most widely used feature extractors for peptides and proteins (Chen et al 2009; Xiao et al 2011a, b; Fang et al 2008; Nanni and Lumini 2008). While maintaining much of the sequence order information, PseAAC represents a protein sequence using a discrete model that is composed of a set of more than 20 discrete factors.…”
Section: Introduction (mentioning)
confidence: 99%
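
The "more than 20 discrete factors" in the statement above can be illustrated with a minimal sketch of a PseAAC-style encoding: 20 amino acid frequencies followed by a few sequence-order correlation factors. This is an illustration under stated assumptions, not the encoding used in any of the cited papers: it uses a single physicochemical property (the Kyte-Doolittle hydropathy scale) where Chou's formulation combines several, and the function name `pseaac`, the parameter names `lam` and `weight`, and their default values are hypothetical choices.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. Assumption: one property only,
# chosen for illustration; other property scales could be substituted.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / std for a in AMINO_ACIDS}


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style feature vector.

    lam    -- number of sequence-order correlation tiers (hypothetical default)
    weight -- weight of the correlation factors (hypothetical default)
    """
    seq = sequence.upper()
    if any(ch not in KYTE_DOOLITTLE for ch in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    h = _standardize(KYTE_DOOLITTLE)

    # First 20 components: normalized occurrence frequencies.
    freqs = [seq.count(a) / len(seq) for a in AMINO_ACIDS]

    # Tier-k correlation factor: mean squared property difference
    # between residues that are k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))

    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

With `lam=3` the vector has 23 components that sum to 1; the last three carry the sequence-order information that plain amino acid composition discards.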
“…These parameters are also given by following equations (1-4) [77]: [equations garbled in extraction; surviving terms: TN, FP·FN, (TP + FN), (TP + FP)] …”
Section: Results (mentioning)
confidence: 99%
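
Equations (1-4) of the citing paper are not recoverable from the fragment above. The surviving terms (TN, FP·FN, TP + FN, TP + FP) resemble the Matthews correlation coefficient, so the sketch below assumes the four parameters are the usual sensitivity, specificity, accuracy, and MCC. That pairing is an assumption, and the function name `binary_metrics` is hypothetical.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute four standard binary-classification measures from confusion counts.

    Assumption: these are the textbook definitions, not necessarily the
    exact equations (1-4) of the citing paper.
    """
    def ratio(num, den):
        # Return 0.0 when a denominator is zero (e.g. no positive samples).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    mcc_den = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

For example, `binary_metrics(40, 45, 5, 10)` gives a sensitivity of 0.8, a specificity of 0.9, and an accuracy of 0.85. MCC is often reported alongside accuracy because it stays informative when the positive and negative classes are unbalanced.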
“…For a summary about its recent development and applications, please see a comprehensive review. [35] Ever since the concept of PseAAC was proposed by Chou [34] in 2001, it has rapidly penetrated into almost all the fields of protein attribute prediction, such as identifying bacterial virulent proteins, [36] predicting homo-oligomeric proteins, [37] predicting anticancer peptides, [38] predicting protein secondary structure content, [39] predicting supersecondary structure, [40] predicting protein structural classes, [41,42] predicting protein quaternary structure, [43] predicting enzyme family and subfamily classes, [44][45][46] predicting protein subcellular location, [47,48] predicting subcellular localization of apoptosis proteins, [49][50][51][52] predicting protein subnuclear location, [43] predicting protein submitochondria locations, [53][54][55] identifying cell wall lytic enzymes, [56] identifying risk type of human papillomaviruses, [57] identifying DNA-binding proteins, [3] predicting G-Protein-Coupled Receptor Classes, [58][59] predicting protein folding rates, [60] predicting outer membrane proteins, [61] predicting cyclin proteins, [62] predicting GABA(A) receptor proteins, [63] identifying bacterial secreted proteins, [64] identifying the cofactors of oxidoreductases, [65] identifying lipase types, [66] identifying protease family, [67] predicting Golgi protein types, [68] classifying amino acids, …”
Section: Pseudo Amino Acid Composition (PseAAC) (mentioning)
confidence: 99%
“…Amino acid composition of proteins associated with the biochemical properties are the commonly used sequence-based features, for example Cai and Lin [1] used protein's amino acid composition, limited range correlation of hydrophobicity and solvent accessible surface area to identify DBPs; Ahmad et al [2] found the specificity of sequence level and binding level and analyzed the relationship between them; Fang et al [3] encoded the feature space by autocross-covariance (ACC) transform, pseudoamino acid composition, dipeptide composition; Zou et al [4] adopted three different feature transformation methods to generate numeric feature vectors from protein sequences; Lin et al [5] represented each sequence as pseudo amino acid composition by applied grey model. For more accurately predictive performance, the combinations of different features were employed, for example Kumar et al [6] derived sequence properties by frequency of amino acid, amino acid groups, secondary structure, com… Abstract: Identification of DNA-binding proteins is an important problem in biomedical research as DNA-binding proteins are crucial for various cellular processes.…”
Section: Introduction (mentioning)
confidence: 99%
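
Of the encodings named in the statement above, dipeptide composition is the simplest to show concretely: each sequence becomes a 400-dimensional vector of adjacent-residue pair frequencies. The following is a minimal sketch of that general idea, not the implementation of Fang et al; the function name `dipeptide_composition` and the choice to skip pairs containing non-standard residues are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered residue pairs, in a fixed order: "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide frequency vector of a sequence.

    Assumption: overlapping pairs are counted, and pairs containing a
    non-standard residue (e.g. 'X') are skipped rather than raising.
    """
    seq = sequence.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")

    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1

    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]
```

For the sequence `"ACAC"` the overlapping pairs are AC, CA, AC, so the AC entry is 2/3 and the CA entry is 1/3. Feature combinations of the kind the statement describes can be formed by concatenating such vectors with other encodings before training a classifier.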