2014
DOI: 10.1371/journal.pone.0105902

PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection

Abstract: X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help na…

Citations: cited by 32 publications (65 citation statements)
References: 56 publications
“…This method was selected because it was found to perform better compared to support vector machine (SVM) and artificial neural network methods. Both XtalPred and XtalPred-RF have accuracies around 70%, but the latter has a higher MCC of 0.47. PPCpred [37] and PredPPCrys [51] use a SVM classifier of exposed and buried amino acid compositions, chain disorder, the proximity of certain groups of amino acids and physicochemical properties of individual and collocated amino acids. This large set of features (~800 for PPCpred and ~3000 for PredPPCrys) is then reduced to obtain a best-performing set.…”
Section: Analysis Of Successful Crystallization Conditions
Citation type: mentioning
confidence: 99%
“…A reduced set of significant features was then identified. As stated above, this method uses an earlier training dataset, and hence its prediction accuracy drops when tested with newer datasets [51]. The reported accuracy was 87%, with MCC = 0.74.…”
Section: Analysis Of Successful Crystallization Conditions
Citation type: mentioning
confidence: 99%
“…The authors of PredPPCrys [52] combined a novel dataset, multi-step feature selection, and SVM classification in an attempt to improve the quality of crystallization success predictions. Similar to PPCpred, their model is able to predict the success rate of individual experimental steps in the structural genomic pipeline.…”
Section: Overall Structural Determination Success
Citation type: mentioning
confidence: 99%
“…Prediction accuracy of crystallizability prediction methods were increased by machine learning approaches such as PPCPred (14) that is based on TargetDB as well, however this application incorporated PepcDB (15) and the source data were filtered more rigorously. PredPPCrys (16) and Crysalis (17) were developed to overcome the problem of the overfitting of supervised machine learning techniques by feature selection and they were shown to be the most accurate among other methods. G-protein-coupled receptors (GPCR), for which the used structures resulted from mainly engineered proteins and short fragments that have limited usefulness for further modeling, were ordered by utilizing special propensity score (18) but it was emphasized that GPCRs are highly challenging to crystallize.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
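
Several of the statements above quote accuracy together with the Matthews correlation coefficient (MCC). As a reading aid, here is a minimal sketch of how the two metrics are computed from a binary confusion matrix. The function names and the example counts are illustrative assumptions; they are not taken from PredPPCrys or from any of the citing papers.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention
    for the otherwise undefined 0/0 case).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical, imbalanced example: 10 crystallizable proteins out of 100.
counts = dict(tp=5, tn=85, fp=5, fn=5)
print(f"accuracy = {accuracy(**counts):.2f}")
print(f"MCC      = {mcc(**counts):.2f}")
```

On class-imbalanced data such as this hypothetical case, accuracy can look high while MCC stays moderate, which is one reason the two are often reported side by side.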