2020
DOI: 10.26434/chemrxiv.12152661.v1
Preprint

Chemical Space Exploration: How Genetic Algorithms Find the Needle in the Haystack

Abstract: We attempt to explain why search algorithms can find molecules with particular properties in an enormous chemical space (ca. 10^60 molecules) by considering only a tiny subset (typically 10^3 to 10^6 molecules). Using a very simple example, we show that the number of potential paths that the search algorithms can follow to the target is equally vast. Thus, the probability of randomly finding a molecule that is on one of these paths is quite high and from here a search algorithm can follow the path to the target molec…
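
The argument in the abstract can be illustrated with a minimal sketch. It is not the authors' example or code: it assumes a toy problem in which the "molecule" is a fixed-length lowercase string, the score is the number of positions matching a target, and the search uses a simple elitist genetic algorithm. The names `toy_ga` and `fitness`, the alphabet, and all parameter values are hypothetical choices made for illustration.

```python
import random
import string

ALPHABET = string.ascii_lowercase  # assumed toy alphabet of 26 symbols


def fitness(candidate: str, target: str) -> int:
    """Count the positions where the candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, target))


def toy_ga(target, pop_size=100, n_parents=20, max_generations=1000, seed=0):
    """Search for `target` (length >= 2) with a simple elitist GA.

    Returns (best string, number of fitness evaluations, size of search space).
    """
    rng = random.Random(seed)
    length = len(target)
    evaluations = 0

    def score(candidate):
        nonlocal evaluations
        evaluations += 1  # count every scored candidate
        return fitness(candidate, target)

    def random_string():
        return "".join(rng.choice(ALPHABET) for _ in range(length))

    # Random initial population, scored once.
    scored = [(score(s), s) for s in (random_string() for _ in range(pop_size))]

    for _ in range(max_generations):
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] == length:
            break  # target found
        parents = scored[:n_parents]  # elitism: the best candidates survive
        children = []
        while len(children) < pop_size - n_parents:
            a = rng.choice(parents)[1]
            b = rng.choice(parents)[1]
            cut = rng.randrange(1, length)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            child[rng.randrange(length)] = rng.choice(ALPHABET)  # point mutation
            child = "".join(child)
            children.append((score(child), child))
        scored = parents + children

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1], evaluations, len(ALPHABET) ** length


if __name__ == "__main__":
    best, evaluations, space = toy_ga("needlehaystack")
    print(f"best={best!r} evaluations={evaluations} space={space:.2e}")
```

In this toy setting, every string with at least one matching position lies on some improving path to the target, which mirrors the abstract's point that the number of usable paths is vast. The search space here (26^14 strings) is far smaller than chemical space, and the scoring function is much smoother than a real molecular property, so the sketch only conveys the qualitative idea.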

Cited by 14 publications (23 citation statements)
References: 0 publications
“…The evolutionary approach on molecular graphs is an efficient method to find needles in a haystack [46]. Thanks to our sequential and atom centred process, we can also visualise the progression of the exploration, allowing a better interpretation of the results (see Fig.…”
Section: Results (mentioning)
confidence: 99%
“…One can also note the generally positive effect of secondary actions that worked with half as many steps. Inspired by Henault et al. in their exploration of the chemical space with a genetic algorithm, we can look more precisely at our efficiency in the rediscovery tasks [46]. In Table 4 we note that the secondary actions have a huge impact on the number of calls to the evaluation function in order to find the target.…”
Section: Results (mentioning)
confidence: 99%
“…In addition to recent ML-based approaches, various genetic algorithm (GA)-based molecular property optimization algorithms have been developed [25,26,27,28,29,30,31,32,33]. The main advantage of GA-based algorithms is that they do not require a large amount of molecule data relevant to a given optimization task because they search novel molecules in a combinatorial and stochastic way.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most existing GA-based molecular optimization algorithms are based on the graph representation of a molecule. In recent studies, they showed competitive, sometimes better, performance compared to ML-based methods in generating novel molecules with desired properties [26,28,29,25]. A GA-based method using the graph representation requires careful design of crossover and/or mutation operations of graphs, which may bias the direction and extension of chemical space search.…”
Section: Introduction (mentioning)
confidence: 99%