2008
DOI: 10.1016/j.jmgm.2008.08.004

Data mining PubChem using a support vector machine with the Signature molecular descriptor: Classification of factor XIa inhibitors

Cited by 41 publications (47 citation statements)
References 49 publications
“…However, there is still a large amount of high-throughput screening data and six crystal structures available. 23 In the following, we will first give an impression of the differences in hit retrieval with geometrically identical models using each program's default settings. We define models to be 'geometrically identical' if the same coordinates are used for feature placement and if they match the same active compounds in each software tool.…”
Section: Introduction (mentioning)
confidence: 99%
“…Thus, HTS data is typically imbalanced in general with a small ratio of active compounds to inactive ones. Although many researchers (Guha and Schurer, 2008; Han et al., 2008; Weis et al., 2008) have noticed this problem when using the data in PubChem, to the best of our knowledge, there has been no method reported to tackle this problem effectively. Han et al.…”
Section: Introduction (mentioning)
confidence: 99%
“…Weis et al. (2008) suggested the assay data be carefully selected from PubChem to attempt to avoid using the imbalanced data in their study.…”
Section: Introduction (mentioning)
confidence: 99%
“…Schierz studied the effect of false positive problems of the PubChem bioassay on virtual screening [14]. Moreover, the PubChem bioassay data have been used in various modeling studies: a decision tree approach using the HTS data [15]; a Bayesian approach using the HTS data [16]; a support vector machine (SVM) approach for inhibitor or ligand classifications [17]; and a GPU accelerated SVM approach [18]. In 2010, Xie reviewed the previous applications of bioassay data in PubChem and he also addressed two major challenges in the application of PubChem bioassay data: a biased active/inactive ratio and experimental errors shown as false positives or negatives [19].…”
Section: Introduction (mentioning)
confidence: 99%