2008
DOI: 10.1021/ci800022e

Evaluation of Virtual Screening Performance of Support Vector Machines Trained by Sparsely Distributed Active Compounds

Abstract: Virtual screening performance of support vector machines (SVM) depends on the diversity of training active and inactive compounds. While diverse inactive compounds can be routinely generated, the number and diversity of known actives are typically low. We evaluated the performance of SVM trained by sparsely distributed actives in six MDDR biological target classes composed of a high number of known actives (983-1645) of high, intermediate, and low structural diversity (muscarinic M1 receptor agonists, NMDA rec…

Cited by 40 publications (56 citation statements)
References 88 publications
“…The vast majority of those methods require the preparation of a training set of compounds (supervised learning) that are used to develop a decision rule that can be then applied to sort a dataset of new molecules (the test set) among particular activity classes [1]. A number of studies have aimed to determine optimal learning parameters and examine their impact on classification effectiveness [2-5]. Interestingly, no extensive research that considers the influence of the ratio of active to inactive training examples on the efficiency of new active compounds recognition has been performed.…”
Section: Introduction (mentioning)
confidence: 99%
“…2, resulting in a model that cannot identify an active compound that has similar structure to the putative negative compounds. The extent of this risk is unknown but the results of this work and two other studies [23,27] have shown that such unwanted effect is expected to be relatively small and it was still possible for a substantial proportion of positive compounds to be classified correctly despite their membership in negative families. Nonetheless, the search for known PI3K inhibitors in this work was carried out to be as extensive as possible to minimize this risk.…”
Section: Discussion (mentioning)
confidence: 81%
“…Thus, this study has adopted the approach by Han et al [19] to generate putative inactive compounds to augment the negative training set. This method can generate putative negatives without requiring the knowledge of actual inactive compounds and studies had shown that classification models derived from these putative negatives can perform reasonably well in virtual screening [23,27]. Nonetheless, the effects of using a large number of putative negatives was examined to ensure that the change is not unacceptably detrimental to the identification of potential inhibitors.…”
Section: Methods (mentioning)
confidence: 99%
“…In-silico methods have been widely explored for facilitating lead discovery against individual targets (33,34). In particular, molecular docking (35), pharmacophore (36), structure-activity relationship (SAR) and quantitative structure activity relationship (QSAR) (37), machine learning (38), and combination methods (39) have been extensively used for searching and designing active compounds against individual targets.…”
Section: In-silico Methods For Searching and Designing Multi-target D… (mentioning)
confidence: 99%