2021
DOI: 10.3390/molecules26175291

Automatic Identification of Analogue Series from Large Compound Data Sets: Methods and Applications

Abstract: Analogue series play a key role in drug discovery. They arise naturally in lead optimization efforts where analogues are explored based on one or a few core structures. However, it is much harder to accurately identify and extract pairs or series of analogue molecules in large compound databases with no predefined core structures. This methodological review outlines the most common and recent methodological developments to automatically identify analogue series in large libraries. Initial approaches focused on…

Citations: cited by 10 publications (10 citation statements)
References: 122 publications (217 reference statements)
“…Generating automatically and consistently the main scaffold or core structure of large data sets can be done in several ways as recently reviewed [24] . In general, it is desirable to generate the scaffolds rapidly, consistently, and interpretable, in particular for an organic or medicinal chemist working on chemical synthesis.…”
Section: Results (mentioning)
confidence: 99%
“…[14] They are helpful at concisely depicting the structure-activity (property) relationships -SA(P)R -in a summarized representation of the data set, as analog series can be represented in fewer data points than individual compounds. [23,24] Of note, only a fraction of the total data is presented in the constellation plot: the compounds forming analog series; in this case, we included only analog series consisting of at least three compounds. Recently, constellation plots have been used to describe a library of antidiabetic natural products [25] and a collection of tubulin inhibitors.…”
Section: Constellation Plots (mentioning)
confidence: 99%
“…This way, the scaffolds extracted as representatives of an analog series take synthetic accessibility into account, an important aspect in medicinal chemistry but mostly ignored in the approaches above. The analog series and their representative scaffolds can be visualised by R-group tables, mapped into coordinate-based chemical space [32], annotated with activity information to support SAR studies, or used to extract favourable lead structures for drug design campaigns [17,18,[33][34][35][36][37][38].…”
Section: Scaffold Approaches In Cheminformatics (mentioning)
confidence: 99%