2023
DOI: 10.1002/minf.202300056

Exploring activity landscapes with extended similarity: is Tanimoto enough?

Abstract: Understanding structure-activity landscapes is essential in drug discovery. Similarly, it has been shown that the presence of activity cliffs in compound data sets can have a substantial impact not only on the design progress but also can influence the predictive ability of machine learning models. With the continued expansion of the chemical space and the currently available large and ultra-large libraries, it is imperative to implement efficient tools to analyze the activity landscape of compound data sets ra…

Cited by 16 publications (10 citation statements)
References 39 publications
“…Most of them are based on the eSIM[12,13] and eSALI frameworks.[11] A few concepts of eSIM and eSALI need to be explained for better understanding of the data splitting methods.…”
Section: Data Splitting Methods (mentioning)
confidence: 99%
“…[13] The eSALI corresponds to a quantitative measurement of the activity landscape roughness of a whole set.[11] However, it can also be used as a loss function to pick molecule from a set. When maximizing eSALI, presence of activity cliffs will be favored.…”
Section: Data Splitting Methods (mentioning)
confidence: 99%
“…The notion of calculating average similarity values over a set of molecules has proven to be particularly powerful over several cheminformatics tasks. For example, extended similarity has been applied to several problems like diversity selection,[37,38] molecular dynamics simulations,[39,40] library diversity studies,[41–43] activity cliffs,[44] descriptor selection for the QSAR/QSPR model,[45] fingerprint evaluations,[46] and chemical space visualization.[47] However, despite these advantages, there are some drawbacks like the need for a coincidence threshold analysis to determine the best similarity/dissimilarity separation and a different numeric value than the pairwise comparisons.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most importantly, the usage of extended similarities displays a much more advantageous, linear scaling of computational demand vs. library size, during common tasks such as diversity selection or clustering.[14] Recently, the various flavors of extended similarity metrics have been implemented into several workflows, including data visualization,[17] activity cliff-detection,[18] or molecular dynamics trajectory sampling.[19] Besides their many advantages, extended similarities require the optimization of several parameters that are absent from the traditional pairwise definitions.…”
Section: Introduction (mentioning)
confidence: 99%
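
The citation statements above contrast pairwise similarity with set-level ("extended") similarity, noting its linear scaling and its coincidence threshold. The sketch below is one possible illustration of that contrast, not the implementation used in the cited papers. It assumes binary fingerprints given as lists of 0/1 and an unweighted counting scheme. For a set of N fingerprints, a bit column with k ones is counted as "1-similar" when 2k − N exceeds the threshold, and as "dissimilar" when |2k − N| is at or below the threshold. Otherwise it is treated as "0-similar" and ignored. The function names and the default threshold of N mod 2 are choices made for this example.

```python
from itertools import combinations


def tanimoto(a, b):
    """Pairwise Tanimoto coefficient for two equal-length 0/1 fingerprints."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0


def mean_pairwise_tanimoto(fps):
    """Average Tanimoto over all N*(N-1)/2 pairs (quadratic in N)."""
    pairs = list(combinations(fps, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def extended_tanimoto(fps, threshold=None):
    """Set-level Tanimoto-style index from column sums (linear in N).

    Assumed, unweighted counting scheme (illustrative only):
      1-similar column:  2k - N > threshold
      dissimilar column: |2k - N| <= threshold
      0-similar column:  everything else (ignored, as in pairwise Tanimoto)
    """
    n = len(fps)
    if threshold is None:
        threshold = n % 2  # assumed default for this sketch
    # One pass over the set: count how many fingerprints set each bit.
    col_sums = [sum(col) for col in zip(*fps)]
    one_sim = sum(1 for k in col_sums if 2 * k - n > threshold)
    dissim = sum(1 for k in col_sums if abs(2 * k - n) <= threshold)
    denom = one_sim + dissim
    return one_sim / denom if denom else 1.0


if __name__ == "__main__":
    fps = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0]]
    print(mean_pairwise_tanimoto(fps))
    print(extended_tanimoto(fps))
```

Under these assumptions the extended index for two fingerprints with a threshold of 0 equals the ordinary pairwise Tanimoto value. For larger sets the value generally differs from the pairwise average, which is consistent with the drawback noted in the quoted statements.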