2009
DOI: 10.2174/138620709788167980

Machine Learning in Virtual Screening

Abstract: In this review, we highlight recent applications of machine learning to virtual screening, focusing on the use of supervised techniques to train statistical learning algorithms to prioritize databases of molecules as active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression…

Citations: cited by 192 publications (142 citation statements)
References: 4 publications
“…The GA seeks to identify those weights that produce the best possible ranking of the molecules in a dataset, and hence to estimate an upper-bound to the effectiveness of virtual screening possible using the substructural analysis approach. The basic idea is illustrated in Figure 1 using a training-set containing three molecules M1-M3, each of which is represented by a fingerprint encoding the presence or absence of five fragments F1-F5.…”
Section: The Genetic Algorithm (mentioning)
confidence: 99%
“…An initial population of possible solutions is generated with the initial weights W1-W5 being assigned by a random-number generator that has been primed in this simple example to generate integer weights in the range 0-10. In the example, the population contains six chromosomes, C1-C6, and the initial population is shown in Figure 1b. Each chromosome is then used to compute the sum-of-weights for each molecule, as shown in Figure 1c.…”
Section: The Genetic Algorithm (mentioning)
confidence: 99%
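
The setup described in the two statements above can be sketched in a few lines of Python. This is an illustrative sketch only, not the cited authors' implementation: the fingerprint bit patterns, the function names, and the fixed seed are all assumptions made for the example, and the selection, crossover, and mutation steps of a full GA are not shown. It builds six chromosomes of five integer weights in the range 0-10, computes the sum-of-weights for each of three molecules, and ranks the molecules by that score.

```python
import random

# Hypothetical fingerprints for three molecules M1-M3 over five fragments
# F1-F5 (1 = fragment present). The bit patterns are invented for illustration.
FINGERPRINTS = {
    "M1": [1, 0, 1, 1, 0],
    "M2": [0, 1, 1, 0, 0],
    "M3": [1, 1, 0, 0, 1],
}


def initial_population(n_chromosomes=6, n_fragments=5, max_weight=10, seed=0):
    """Create chromosomes of random integer fragment weights in 0..max_weight."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps runs repeatable
    return [
        [rng.randint(0, max_weight) for _ in range(n_fragments)]
        for _ in range(n_chromosomes)
    ]


def sum_of_weights(chromosome, fingerprint):
    """Add up the weights of the fragments that are present in the molecule."""
    return sum(w for w, bit in zip(chromosome, fingerprint) if bit)


def score_molecules(chromosome, fingerprints=FINGERPRINTS):
    """Score every molecule with one chromosome's weights."""
    return {name: sum_of_weights(chromosome, fp) for name, fp in fingerprints.items()}


def rank_molecules(chromosome, fingerprints=FINGERPRINTS):
    """Order molecule names by decreasing sum-of-weights (ties keep input order)."""
    scores = score_molecules(chromosome, fingerprints)
    return sorted(scores, key=lambda name: scores[name], reverse=True)


population = initial_population()
for i, chromosome in enumerate(population, start=1):
    print(f"C{i}", chromosome, score_molecules(chromosome), rank_molecules(chromosome))
```

A full GA would then score each chromosome by how well its ranking places known actives ahead of inactives and evolve the population accordingly; that fitness function is not specified in the quoted text, so it is left out here.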
“…A more sophisticated representation is obtained when each bit is replaced by an integer or real value that reflects a fragment's specific contribution to the calculation of inter-molecular similarity, with the largest weights being assigned to the most important fragments. Fragment weights are widely used in machine learning approaches to virtual screening [18][19][20] but these require the availability of extensive training-sets of active and inactive molecules, whereas in similarity searching the only information that is typically available is a single bioactive reference structure. That said, there is one additional source of information that could be exploited in a similarity calculation, viz information about the frequencies with which fragments occur: either the frequencies with which they occur within individual molecules; or the frequencies with which they occur in the entire database that is being searched.…”
Section: Similarity-based Virtual Screening (mentioning)
confidence: 99%
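
One way to picture frequency-based fragment weighting in a similarity search is sketched below. The quoted statement does not give a formula, so the choices here are assumptions made purely for illustration: fragments are weighted by the logarithm of their inverse frequency in the database, and similarity is a weighted Tanimoto coefficient over binary fingerprints. The function names and the toy database are hypothetical.

```python
import math


def inverse_frequency_weights(database):
    """Weight each fragment by log(N / n_j), where n_j is the number of
    database molecules containing fragment j (an assumed weighting scheme).
    Fragments that never occur get weight 0."""
    n_molecules = len(database)
    n_fragments = len(database[0])
    weights = []
    for j in range(n_fragments):
        count = sum(fp[j] for fp in database)
        weights.append(math.log(n_molecules / count) if count else 0.0)
    return weights


def weighted_tanimoto(a, b, weights):
    """Weighted Tanimoto: weight of shared fragments over weight of fragments
    present in either molecule. Returns 0.0 when that denominator is zero."""
    common = sum(w for w, x, y in zip(weights, a, b) if x and y)
    either = sum(w for w, x, y in zip(weights, a, b) if x or y)
    return common / either if either else 0.0


# Toy database of binary fingerprints over three fragments (invented data).
database = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
]
weights = inverse_frequency_weights(database)
reference = database[0]  # the single bioactive reference structure
for fp in database:
    print(fp, round(weighted_tanimoto(reference, fp, weights), 3))
```

With unit weights the function reduces to the ordinary binary Tanimoto coefficient. Under the assumed scheme, a fragment that occurs in every database molecule receives weight zero and so contributes nothing to the similarity.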
“…Data mining approaches based on cheminformatics modeling has been extensively used to prioritize molecules from large chemical datasets for specific biological activities. Such in-silico prioritization of molecules has been suggested to accelerate drug discovery by drastically reducing the time and cost-factor in conventional drug discovery processes [17][18][19][20].…”
Section: Introduction (mentioning)
confidence: 99%