2020
DOI: 10.1073/pnas.2000585117

Machine learning classification can reduce false positives in structure-based virtual screening

Abstract: With the recent explosion in the size of libraries available for screening, virtual screening is positioned to assume a more prominent role in early drug discovery’s search for active chemical matter. In typical virtual screens, however, only about 12% of the top-scoring compounds actually show activity when tested in biochemical assays. We argue that most scoring functions used for this task have been developed with insufficient thoughtfulness into the datasets on which they are trained and tested, leading to…

Cited by 154 publications (135 citation statements)
References 109 publications
“…This shows again that optimal models can be anticipated without any need for PM decoys. Table 1 compiles several freely-available off-the-shelf generic ML-based SFs that could be used on the test set: NNScore [44,45], Δ vina RF 20 [46], Δ vina XGB [47], Convolutional NN [26], RF-Score-VS [27] or vScreenML [48]. For any employed SF, it is necessary to find out which protein-ligand complexes were used for training it and remove them from the test set if also found there.…”
Section: Selecting a Scoring Function Based On Your Own Evaluation (mentioning)
confidence: 99%
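The filtering step described in the statement above (dropping any test-set complex that the scoring function was trained on) can be sketched as follows. This is a minimal illustration only, not the citing authors' procedure: it assumes complexes are identified by case-insensitive ID strings, and the function name `remove_training_overlap` and all IDs shown are hypothetical placeholders.

```python
def remove_training_overlap(test_ids, training_ids):
    """Return the test-set complex IDs that do not appear in the training set.

    Assumption: each complex is identified by a string ID compared
    case-insensitively after trimming whitespace. Input order is preserved.
    """
    seen = {cid.strip().upper() for cid in training_ids}
    return [cid for cid in test_ids if cid.strip().upper() not in seen]


# Hypothetical placeholder IDs, for illustration only.
test_set = ["1abc", "2XYZ", "3def"]
training_set = ["2xyz", "9QRS"]

filtered = remove_training_overlap(test_set, training_set)
print(filtered)  # ['1abc', '3def']
```

Matching on IDs alone only catches exact duplicates; near-duplicates (for example, the same protein with a closely related ligand) would need a similarity-based filter instead.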
“…In addition to evaluating the SFs on the entire test set, their performance on certain test subsets would be very informative. For instance, on the subset of complexes bound by molecules dissimilar to any training set molecule, as SFs performing well here should discover a higher proportion of novel compounds [5,8,48]. Another example is the subset of complexes with most potent actives/binders (e.g.…”
Section: Selecting a Scoring Function Based On Your Own Evaluation (mentioning)
confidence: 99%
“…As a machine-learning-based classification device, the decryptors presented in this work will always show a certain false match rate—a challenge inherent to the field of machine learning classification 38 40 . A number of techniques have therefore been developed to decrease the false match rate in a given classification setting, which can be equally applied to the decryptors presented in this work.…”
Section: Discussion (mentioning)
confidence: 99%
“…AI can find new molecular compounds and emerging drug targets much faster than traditional methods, thus speeding up the progress of drug development [ 184 , 185 ]. At the same time, AI can more accurately predict the follow-up experimental results of new drugs, so as to improve the accuracy at each stage of drug development [ 186 ]. Computer-aided drug design techniques are thus revolutionizing MSCs therapies.…”
Section: Advances and Perspectives To Overcome Challenges In Msc Clin (mentioning)
confidence: 99%