2017
DOI: 10.1186/s13321-017-0232-0

Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set

Abstract: The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies…

Citations: cited by 256 publications (259 citation statements)
References: 41 publications (52 reference statements)
“…[206] Current achievements in reaction prediction and retrosynthesis demonstrate the potential of machine learning to solve one of the bottlenecks of state-of-the-art materials design, which is the planning of efficient reaction routes of new molecules. Examples can be given for, e.g., feature detection, [207] bioactivity prediction, [208] or drug target prediction, [209] and others.…”
Section: Discussion (mentioning)
confidence: 99%
“…Although there are several chemistry problems where DNNs outperform other shallow machine learning methods [49, 59, 60], here the MFP+RF performed best with the small dataset of 676 molecules in the 5- and 12-class predictions. However, in the 3-class task with the small dataset, and all the tasks with the large dataset, the two models produced accuracies that were nearly indistinguishable.…”
Section: Discussion (mentioning)
confidence: 85%
“…For tankyrase, due to the relatively small number of known inhibitors, the improvement from 0.81 to 0.93 (random test set split) or 0.89 (time‐based split) was obtained. As can be seen, the Deep Neural Networks (DNN) generally yield the best performing models, consistent with the results obtained previously. Using the network with two hidden layers (300 and 100 neurons), the external test AUC ROC values of 0.96 and 0.93 are obtained for the random split and 0.90 and 0.84 for the time‐based split of the PI3Kα and tankyrase inhibitors, respectively.…”
Section: Results (mentioning)
confidence: 99%