Herman van Vlijmen scite author profile

We evaluate 3D models of human nucleoside diphosphate kinase, mouse cellular retinoic acid binding protein I, and human eosinophil neurotoxin that were calculated by MODELLER, a program for comparative protein modeling by satisfaction of spatial restraints. The models have good stereochemistry and are at least as similar to the crystallographic structures as the closest template structures. The largest errors occur in the regions that were not aligned correctly or where the template structures are not similar to the correct structure. These regions correspond predominantly to exposed loops, insertions of any length, and non-conserved side chains. When a template structure with more than 40% sequence identity to the target protein is available, the model is likely to have about 90% of the mainchain atoms modeled with an rms deviation from the X-ray structure of approximately 1 Å, in large part because the templates are likely to be that similar to the X-ray structure of the target. This rms deviation is comparable to the overall differences between refined NMR and X-ray crystallography structures of the same protein.
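
The rms deviation quoted in this abstract can be illustrated with a minimal sketch. It assumes the two structures are already superposed and that atoms are paired in the same order; the function name `rmsd` and the coordinates are hypothetical and are not taken from MODELLER or from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up mainchain coordinates in Å: the model is shifted by 1 Å along y.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
xray = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(model, xray))
```

In practice the model would first be optimally superposed on the X-ray structure, a step this sketch leaves out.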

Large scale relative protein ligand binding affinities using non-equilibrium alchemy

Gapsys et al. 2020


Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set

et al. 2017

The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies, and different metrics. In this study, different methods were compared using one single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks are the top performing classifiers, highlighting the added value of Deep Neural Networks over other more conventional methods. Moreover, the best method (‘DNN_PCM’) performed significantly better at almost one standard deviation higher than the mean performance. Furthermore, Multi-task and PCM implementations were shown to improve performance over single task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations under the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around mean performance. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with unoptimized ‘DNN_PCM’). 
Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing the data and the protocols.

Electronic supplementary material: The online version of this article (doi:10.1186/s13321-017-0232-0) contains supplementary material, which is available to authorized users.
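
One of the two metrics named in this abstract, the Matthews Correlation Coefficient, can be sketched from binary labels as follows. This is an illustration of the standard formula, not the study's evaluation code; the function name and the example labels are hypothetical, and returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up labels: one of the two actives is missed by the classifier.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(matthews_corrcoef(y_true, y_pred))
```

The coefficient ranges from -1 to 1 and uses all four cells of the confusion matrix, which is one reason it suits benchmarks with unbalanced active and inactive classes.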

Improving the accuracy of protein pKa calculations: Conformational averaging versus the average structure

1998

Several methods for including the conformational flexibility of proteins in the calculation of titration curves are compared. The methods use the linearized Poisson-Boltzmann equation to calculate the electrostatic free energies of solvation and are applied to bovine pancreatic trypsin inhibitor (BPTI) and hen egg-white lysozyme (HEWL). An ensemble of conformations is generated by a molecular dynamics simulation of the proteins with explicit solvent. The average titration curve of the ensemble is calculated in three different ways: an average structure is used for the pKa calculation; the electrostatic interaction free energies are averaged and used for the pKa calculation; and the titration curve for each structure is calculated and the curves are averaged. The three averaging methods give very similar results and improve the pKa values to approximately the same degree. This suggests, in contrast to implications from other work, that the observed improvement of pKa values in the present studies is due not to averaging over an ensemble of structures, but rather to the generation of a single properly averaged structure for the pKa calculation.
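
The difference between averaging a quantity before computing the titration curve and averaging the curves themselves can be shown with a toy sketch. It assumes a single, non-interacting titratable site that follows the Henderson-Hasselbalch equation, and it uses made-up per-snapshot pKa values; it does not solve the Poisson-Boltzmann equation and is not the method used in the paper.

```python
def protonated_fraction(ph, pka):
    """Henderson-Hasselbalch protonated fraction for one isolated site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def curve_from_mean_pka(ph, pkas):
    """Average the per-snapshot pKa first, then evaluate a single curve."""
    mean_pka = sum(pkas) / len(pkas)
    return protonated_fraction(ph, mean_pka)


def mean_of_curves(ph, pkas):
    """Evaluate one curve per snapshot, then average the curves."""
    return sum(protonated_fraction(ph, pka) for pka in pkas) / len(pkas)


# Hypothetical pKa values from two simulation snapshots.
snapshot_pkas = [4.0, 6.0]
for ph in (3.0, 4.0, 5.0, 6.0, 7.0):
    print(ph, curve_from_mean_pka(ph, snapshot_pkas), mean_of_curves(ph, snapshot_pkas))
```

With this deliberately wide spread of pKa values the two averages agree at the midpoint (pH 5) and differ elsewhere. The abstract reports that for BPTI and HEWL the averaging methods gave very similar results.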
