2022
DOI: 10.1021/acs.jcim.2c01073

Exposing the Limitations of Molecular Machine Learning with Activity Cliffs

Abstract: Machine learning has become a crucial tool in drug discovery and chemistry at large, e.g., to predict molecular properties, such as bioactivity, with high accuracy. However, activity cliffs (pairs of molecules that are highly similar in their structure but exhibit large differences in potency) have received limited attention for their effect on model performance. Not only are these edge cases informative for molecule discovery and optimization but also models that are well equipped to accurately predict the pote…
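The definition of an activity cliff in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: molecules are represented as plain sets of integer feature IDs, similarity is Tanimoto (Jaccard) overlap, potency is a pIC50-like log value, and the thresholds `sim_threshold=0.9` and `potency_gap=1.0` as well as the names `tanimoto` and `find_activity_cliffs` are hypothetical.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def find_activity_cliffs(molecules, sim_threshold=0.9, potency_gap=1.0):
    """Return name pairs that are structurally similar but differ in potency.

    molecules: list of (name, feature_set, potency) tuples, where potency
    is a log-scale value. Both thresholds are illustrative assumptions.
    """
    cliffs = []
    for (n1, f1, p1), (n2, f2, p2) in combinations(molecules, 2):
        similar = tanimoto(f1, f2) >= sim_threshold
        large_gap = abs(p1 - p2) >= potency_gap
        if similar and large_gap:
            cliffs.append((n1, n2))
    return cliffs


# Toy data: feature sets stand in for real molecular fingerprints.
molecules = [
    ("mol_a", frozenset(range(0, 20)), 8.5),
    ("mol_b", frozenset(range(0, 19)), 6.0),
    ("mol_c", frozenset(range(0, 20)) | {99}, 8.4),
    ("mol_d", frozenset(range(50, 70)), 5.0),
]

print(find_activity_cliffs(molecules))
```

In this toy set, `mol_a` and `mol_c` are near-identical in both structure and potency, so they are not flagged; the flagged pairs are the ones where a small feature change coincides with a large potency change.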

citations
Cited by 103 publications
(112 citation statements)
references
References 105 publications
“…This can be demonstrated by the number of unique catalysts present in the database that correspond to a particular chemotype. We recorded 27 urea-based catalysts that were good for at least one Mannich reaction which is greater than the number of secondary amines (21), although more reactions have been performed with secondary amines. Similarly, the number of unique cinchona alkaloids (16) is high despite a much lower number of reactions reported.…”
Section: ■ Results and Discussion
mentioning
confidence: 99%
“…This approach has been used to synthesize and test promising compounds, and has resulted in novel therapeutics, including the discovery of a new antibiotic [5]. Many recent advances leverage deep learning formulations [6–19], but there are still open challenges to realize the full potential of molecular property prediction, including data sparsity and imbalance [17], and activity cliffs [20], among others. Using only chemical structures might have other limitations due to lack of information on biological contexts or how living organisms respond to treatments.…”
mentioning
confidence: 99%
“…Matched molecular pairs (MMPs) are pairs of molecules that differ structurally at only one site by a known transformation. MMPs are widely used in drug discovery and medicinal chemistry as these facilitate fast and easy understanding of structure–activity relationships. Counterfactuals and MMP examples intersect if the structural change is associated with a significant change in the properties. If the associated changes in the properties are nonsignificant, the two molecules are known as bioisosteres. The connection between MMPs and adversarial training examples has been explored in ref . MMPs, which belong to the counterfactual category, are commonly used in outlier and activity cliff detection.…”
Section: Theory
mentioning
confidence: 99%
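The distinction drawn in this statement, between a matched pair whose single-site change shifts a property significantly and one whose change does not, can be sketched as a simple rule. This is an illustrative sketch only: the function name `classify_matched_pair` and the significance threshold of 1.0 are hypothetical and do not come from the quoted paper.

```python
def classify_matched_pair(prop_a: float, prop_b: float, threshold: float = 1.0) -> str:
    """Label a matched molecular pair by the size of its property change.

    prop_a, prop_b: property values (e.g. log-scale potency) of the two
    molecules in the pair. The threshold is an assumed cut-off for what
    counts as a "significant" change.
    """
    delta = abs(prop_a - prop_b)
    return "counterfactual" if delta >= threshold else "bioisostere"


print(classify_matched_pair(8.2, 6.1))  # large change in the property
print(classify_matched_pair(7.0, 7.2))  # small change in the property
```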
“…If the associated changes in the properties are nonsignificant, the two molecules are known as bioisosteres [116, 117]. The connection between MMPs and adversarial training examples has been explored in ref (118). MMPs, which belong to the counterfactual category, are commonly used in outlier and activity cliff detection.…”
Section: Theory
mentioning
confidence: 99%