2016
DOI: 10.1002/minf.201600085

Multi‐iPPseEvo: A Multi‐label Classifier for Identifying Human Phosphorylated Proteins by Incorporating Evolutionary Information into Chou′s General PseAAC via Grey System Theory

Abstract: Predicting phosphorylated proteins is a challenging problem, particularly when query proteins have multi-label features, meaning that they may be phosphorylated at two or more different types of amino acid. In fact, human proteins are usually phosphorylated at serine, threonine and tyrosine. By introducing the "multi-label learning" approach, a novel predictor has been developed that can deal with systems containing both single- and multi-label phosphorylated proteins. Here we proposed a predictor cal…

Cited by 35 publications (7 citation statements)
References 87 publications

“…Also, our result manifested that it appears that using FDAs is essential for the prediction of acetylation functional class, which had been reported in previous research (Qiu et al, 2016a,b, 2017b), and the information related to subcellular is also important for identifying the PTM proteins. As the growing demand of verification of acetylation sites, we argue that more effort should be input in developing organism-specific predictors for this issue.…”
Section: Results | supporting
confidence: 77%
“…In fact, we have made some preliminary exploration and attempt on identifying phosphorylated proteins. In Qiu et al (2017a,b), we presented a method for identifying human phosphorylated proteins and a multi-label classifying model for different type of phosphorylated proteins with the help of the General PseAAC concept and gray system theory. Although the results are not so perfect, we still argue that the formulations and models can be applied to this issue, and it may be more powerful when some structure, function or localization information of proteins were added into the model.…”
Section: Introduction | mentioning
confidence: 99%
“…negative site in total 24669) were also extracted from the same protein sequences to maintain consistency. For formulating any post-translational modification site (PTM), a de-facto standard used by the researchers [19][20][21][22] is to extract the PTM site centred (in this experiment, lysine centred) peptide segments of an optimal size. These peptide segments can be expressed as-…”
Section: Datasets | mentioning
confidence: 99%
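The statement above describes cutting site-centred peptide segments out of protein sequences. A minimal Python sketch of that idea follows; it is an illustration only, not the cited authors' code. The function name `extract_centred_segments`, the flank size, and the use of "X" to pad segments that run past a terminus are all assumptions, since the quoted text gives neither the segment size nor the padding rule.

```python
def extract_centred_segments(sequence, residue="K", flank=10, pad="X"):
    """Return (position, segment) pairs for every `residue` in `sequence`.

    Each segment has length 2 * flank + 1 with the target residue in the
    centre. Positions are 1-based. Padding with `pad` at the termini is an
    assumption made for this sketch.
    """
    padded = pad * flank + sequence + pad * flank
    segments = []
    for i, amino_acid in enumerate(sequence):
        if amino_acid == residue:
            # Index i in `sequence` sits at index i + flank in `padded`,
            # so the window starting at i is centred on the target residue.
            segments.append((i + 1, padded[i : i + 2 * flank + 1]))
    return segments


if __name__ == "__main__":
    # Toy sequence, not real data.
    for position, segment in extract_centred_segments("MKTAYK", flank=2):
        print(position, segment)
```

Passing `residue="S"`, `"T"` or `"Y"` would give the serine-, threonine- or tyrosine-centred segments relevant to phosphorylation instead of the lysine-centred ones used in the citing study.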
“…By doing this, a set of benchmark datasets have been obtained. In PTM Site prediction, the presence of homology and redundancy in the peptide segments may bias or overestimate the performance of the predictors, which is also mentioned by different PTM site researchers [19][20][21][22][23]. This study investigates this issue by considering less than 40% pairwise sequence identity in different initial benchmark datasets (for different segment size) using clustering.…”
Section: S = S ∪ S | mentioning
confidence: 99%
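The redundancy concern in the statement above can be illustrated with a small greedy filter that keeps a segment only if it shares less than 40% identity with every segment already kept. This is a simplified stand-in, not the clustering procedure of the cited study, which the quoted text does not specify. The helper names and the position-wise identity measure (valid only for equal-length segments) are assumptions.

```python
def pairwise_identity(a, b):
    """Fraction of positions at which two equal-length segments agree."""
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def filter_redundant(segments, max_identity=0.4):
    """Greedily keep segments whose identity to every kept one is below
    `max_identity`. The result depends on input order (assumption: the
    first segment seen represents its group)."""
    kept = []
    for segment in segments:
        if all(pairwise_identity(segment, k) < max_identity for k in kept):
            kept.append(segment)
    return kept


if __name__ == "__main__":
    # Toy segments, not real data.
    print(filter_redundant(["AAKAA", "AAKAC", "CDKEF", "GHKIL"]))
```

Because the centre residue is identical in every site-centred segment, short windows have a built-in identity floor, which is one reason the threshold interacts with the chosen segment size.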
“…In order to launch a useful sequence-based statistical predictor for a biological system as demonstrated in a series of recent publications [8,15,[28][29][30][31][32][33][34][35], the Chou's five-step rules [36] should be followed: 1) construct or select a valid benchmark dataset to train and test the predictor, 2) formulate the biological sequence samples with an effective mathematical expression that can truly reflect their intrinsic correlation with the target to be predicted, 3) introduce or develop a powerful algorithm (or engine) to operate the prediction, 4) properly perform cross-validation tests to objectively evaluate its anticipated accuracy, and 5) establish a user-friendly webserver that is accessible to the public.…”
Section: Introduction | mentioning
confidence: 99%