2013
DOI: 10.1186/1758-2946-5-42

Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets

Abstract: Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, a novel protein descriptor set (termed ProtFP (4 variants)), and in…

Citations: cited by 72 publications (91 citation statements)
References: 57 publications
“…We found that all descriptors, with the exception of SOCN, performed at the same level of statistical significance (ANOVA P > 0.05; Tukey’s Honest Significance Difference (HSD), α = 0.05, n = 15) (Table 1 and Figure 2A). These results are in agreement with previous studies, where model performance did not consistently vary across amino acid descriptor sets, and where prediction errors for individual targets were higher than the error differences obtained with models trained on different combinations of amino acid descriptors [26,30,32]. Therefore, we conclude that the predictive signal provided by all protein descriptors except SOCN for the modelling of PARP inhibition is comparable.…”
Section: Results; citation type: supporting
confidence: 93%
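The comparison quoted above rests on a one-way ANOVA across descriptor sets. As a minimal sketch of that first step only, the function below computes the ANOVA F statistic from per-descriptor lists of model scores. The function name and the toy scores are hypothetical, the P value and the Tukey HSD post-hoc test are not computed here, and nothing in this block is taken from the citing paper's code.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score groups.

    Each group holds the scores of the models trained with one descriptor
    set (hypothetical layout, e.g. 15 resampled models per set).
    """
    k = len(groups)                      # number of descriptor sets
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy scores for three hypothetical descriptor sets.
toy_scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_anova_f(toy_scores)
```

A large F suggests that at least one descriptor set differs from the rest; identifying which one (SOCN in the quoted statement) is what a post-hoc test such as Tukey's HSD is for.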
“…To determine which protein descriptors provide the highest predictive signal, we benchmarked 8 binding site amino acid (Table 1A) [29,30] and 11 full protein sequence descriptors [31] (Table 1B). We trained 15 models for each combination of compound and protein descriptors, each time using different resamples to define the training and test sets.…”
Section: Results; citation type: mentioning
confidence: 99%
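The repeated-resampling protocol described in this statement (15 models per descriptor combination, each with a different training/test split) can be sketched as follows. This is an illustration under stated assumptions: the function name, the 30% test fraction, and the seeding scheme are hypothetical and are not given in the quoted text; only the count of 15 repeats comes from it.

```python
import random


def resample_splits(n_samples, n_repeats=15, test_fraction=0.3, seed=0):
    """Return n_repeats random (train_indices, test_indices) pairs.

    test_fraction and seed are assumptions for illustration; the quoted
    statement only says that 15 different resamples were used.
    """
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    splits = []
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)             # a fresh permutation per repeat
        test = sorted(indices[:n_test])
        train = sorted(indices[n_test:])
        splits.append((train, test))
    return splits


# One model would be trained and evaluated per split, per descriptor combination.
splits = resample_splits(n_samples=10)
```

Collecting one test-set score per split gives the n = 15 values per descriptor combination that a later statistical comparison can use.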
“…PCM is a branch of chemometrics which uses mathematical and statistical approaches to model the interactions between a series of ligands and a set of receptors. One major strength of PCM is that it does not require structural information of proteins to provide specific information about their functions. Since its first introduction by Lapinsh et al. in 2001, the approach has been successfully applied to investigate different protein families such as cytochrome P450, kinases, melanocortin receptors, G protein‐coupled receptors, HIV proteases, aromatases, carbonic anhydrases, and phosphodiesterases.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Unlike the traditional QSAR, in proteochemometric modeling (PCM) approach descriptors of proteins and cross‐terms made from descriptors of ligands and proteins are correlated with the activity data for protein–ligand interactions. A recent study has revealed that combinations of descriptors from different aspect may help increase the performance of proteochemometric modeling.…”
Section: Introduction; citation type: mentioning
confidence: 99%
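To make the idea of cross-terms in the statement above concrete, here is a minimal sketch that assumes one common construction: every ligand descriptor is multiplied by every protein descriptor, and the products are appended to the two original descriptor blocks. The function names and toy values are hypothetical, and the quoted papers may build their cross-terms differently.

```python
from itertools import product


def cross_terms(ligand, protein):
    """Return all pairwise products ligand[i] * protein[j], ligand-major order."""
    return [l * p for l, p in product(ligand, protein)]


def pcm_features(ligand, protein):
    """Concatenate ligand descriptors, protein descriptors, and cross-terms."""
    return list(ligand) + list(protein) + cross_terms(ligand, protein)


# Toy descriptor vectors (hypothetical values, not real descriptor sets).
ligand = [1.0, 2.0]
protein = [0.5, -1.0, 3.0]
features = pcm_features(ligand, protein)
```

With 2 ligand and 3 protein descriptors the feature vector has 2 + 3 + 6 = 11 entries; each such vector, paired with a measured activity, would form one training example for a PCM model.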