2015
DOI: 10.1007/978-3-319-24462-4_19

The Importance of the Regression Model in the Structure-Based Prediction of Protein-Ligand Binding

Abstract: Docking is a key computational method for structure-based design of starting points in the drug discovery process. Recently, the use of nonparametric machine learning to circumvent modelling assumptions has been shown to result in a large improvement in the accuracy of docking. As a result, these machine-learning scoring functions are able to widely outperform classical scoring functions. The latter are characterized by their reliance on a predetermined theory-inspired functional form for the relatio…

Cited by 4 publications
(7 citation statements)
References 15 publications
“…The risk of overfitting increases the importance of rigorous validation, 26,27 but the inherent increase in flexibility allows machine learning methods to outperform more constrained methods when trained on an identical input set. 28 The choice of input features can limit the expressiveness of a machine learning method. Features such as atom interaction counts, 22 pairwise atom distance descriptors, 13 interaction fingerprints, 21 or “neural fingerprints” generated by learned atom convolutions 24 necessarily eliminate or approximate the information inherent in a protein-ligand structure, such as precise spatial relationships.…”
Section: Introduction (mentioning)
confidence: 99%
“…Traditional approaches have typically used experimental data to parametrize a physically inspired function. While interpretable, these techniques are inherently limited in their ability to capture complex interactions due to the use of rigid functional forms. Many machine learning-based scoring functions reuse the features of traditional approaches, but exploit the greater flexibility in model structure to produce better representations of the same input data. However, this can lead to overfitting and often results in a loss of interpretability.…”
Section: Introduction (mentioning)
confidence: 99%
“…Many machine learning-based scoring functions reuse the features of traditional approaches 8,17 but exploit the greater flexibility in model structure to produce better representations of the same input data. 18 However, this can lead to overfitting and often results in a loss of interpretability. In addition, the use of specific features, such as descriptors 17,19 or fingerprints, 20 both biases the model to the choice of features and leads to an unnecessary loss of information through the elimination or approximation of the raw structural data.…”
Section: Introduction (mentioning)
confidence: 99%
“…Machine learning based models, on the other hand, are not necessarily bound by such constraints and have the potential to learn non linear relationships and capture binding features that are hard to model explicitly [16]. This has become an increasingly popular approach to model protein-ligand scoring functions.…”
Section: Introduction (mentioning)
confidence: 99%