2022
DOI: 10.48550/arxiv.2201.09637
Preprint

DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations

Abstract: AI-aided drug discovery (AIDD) is gaining increasing popularity due to its promise of making the search for new pharmaceuticals quicker, cheaper and more efficient. In spite of its extensive use in many fields, such as ADMET prediction, virtual screening, protein folding and generative chemistry, little has been explored in terms of the out-of-distribution (OOD) learning problem with noise, which is inevitable in real world AIDD applications. In this work, we present DrugOOD, a systematic OOD dataset curator …

Cited by 17 publications (36 citation statements)
References 124 publications (164 reference statements)
“…Ding et al [7] collect several datasets to compare the performance of well-known baselines and data augmentation methods. DrugOOD [19] is a recent benchmark specifically designed for molecular graph OOD problems. It is curated based on a large-scale bioassay dataset ChEMBL [31] and includes an automated pipeline for obtaining more datasets.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although both general OOD problems and graph analysis [21,11,40,46,28] have been intensively studied, graph OOD is only an emerging area of research [44,43,53,5]. Some initial attempts have also been made to curate graph OOD benchmarks [19,7]. However, existing benchmarks lack in several aspects, as detailed in Section 2.…”
Section: Introduction (mentioning)
confidence: 99%
“…A critical step in drug discovery is to select compounds with high biological activity (Wallach et al., 2015; Li et al., 2021; Ji et al., 2022), diversity and satisfactory ADME (absorption, distribution, metabolism, and excretion) properties (Gimeno et al., 2019). As a result, virtual screening is typically a hierarchical filtering process with several necessary filters, e.g., first choosing the highly active compounds, then selecting diverse subsets from them, and finally excluding compounds that are bad for ADME.…”
Section: Compound Selection in AI-aided Drug Discovery (mentioning)
confidence: 99%
“…iii) The optimal transport plan obtained by MOT can be used to design new message passing schemes in graphs. iv) The proposed method can potentially be extended to more challenging settings that need advanced knowledge transfer techniques, such as graph out-of-distribution learning [Ji et al., 2022]…”
Section: Ablation Studies (mentioning)
confidence: 99%