2020
DOI: 10.1186/s13321-019-0404-1

Mol-CycleGAN: a generative model for molecular optimization

Abstract: Designing a molecule with desired properties is one of the biggest challenges in drug development, as it requires optimization of chemical compound structures with respect to many complex properties. To augment the compound design process we introduce Mol-CycleGAN, a CycleGAN-based model that generates optimized compounds with high structural similarity to the original ones. Namely, given a molecule, our model generates a structurally similar one with an optimized value of the considered property. We evaluate t…

Cited by 227 publications (206 citation statements)
References 40 publications
“…The literature concerning generative models of molecules has exploded since the first work on the topic Gómez-Bombarelli et al [2018]. Current methods feature molecular representations such as SMILES [Janz et al, 2018, Segler et al, 2017, Skalic et al, 2019, Ertl et al, 2017, Lim et al, 2018, Kang and Cho, 2018, Sattarov et al, 2019, Gupta et al, 2018, Harel and Radinsky, 2018, Yoshikawa et al, 2018, Bjerrum and Sattarov, 2018, Mohammadi et al, 2019] and graphs [Simonovsky and Komodakis, 2018, Li et al, 2018a, De Cao and Kipf, 2018, Kusner et al, 2017, Dai et al, 2018, Samanta et al, 2019, Li et al, 2018b, Kajino, 2019, Jin et al, 2019, Bresson and Laurent, 2019, Lim et al, 2019, Pölsterl and Wachinger, 2019, Krenn et al, 2019, Maziarka et al, 2019, Madhawa et al, 2019, Shen, 2018, Korovina et al, 2019]. In this section we conduct an empirical test of the hypothesis from [Gómez-Bombarelli et al, 2018] that the decoder's lack of efficiency is due to data point collection in "dead regions" of the latent space far from the data on which the VAE was trained. We use this information to construct a binary classification Bayesian Neural Network (BNN) to serve as a constraint function that outputs the probability of a latent point being valid, the details of which will be discussed in the section on labelling criteria.…”
Section: Related Work (mentioning)
confidence: 99%
“…Likewise, Maziarka et al [44] implemented a deep learning GAN architecture called the Mol-CycleGAN structure to produce optimized molecular compounds where their molecular structures were highly similar to the original ones. It should be emphasized that both the generative and discriminative network modules in the Mol-CycleGAN structure directly performed with latent vectors, and then the latent vectors were translated back to chemical structures (represented as molecular graphs).…”
Section: Molecular De Novo Design (mentioning)
confidence: 99%
“…CycleGAN provides unpaired image-to-image translation using Cycle-Consistent Adversarial Networks (Zhu et al, 2017). MolCycleGAN, which extended the CycleGAN framework with an added loss and extra encoding network, maps from distribution to distribution on unpaired samples, so it can amplify the size of our dataset in the process by taking all of the pairing combinations rather than relying on a training dataset of predefined molecule-inhibitor pairs (Maziarka et al, 2020). The advantage of MolCycleGAN is the ability to learn transformation rules from the sets of compounds with desired and undesired values of the considered property.…”
Section: The Rise Of the Machines: Allosteric Mechanisms Through The (mentioning)
confidence: 99%
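
The statements above describe a CycleGAN-style model that works on latent vectors and learns from two unpaired sets of compounds. As a rough illustration only, the sketch below computes a cycle-consistency term and an identity-mapping term for two toy mappings `G` (set X to set Y) and `F` (set Y to set X). Everything in it is an assumption made for illustration: the names `G`, `F`, `cycle_loss` and `identity_loss`, the L1 distance, and the toy shift mappings are not taken from the cited papers. The excerpt does not say what the "added loss" is, so an identity term in the form used by the original CycleGAN stands in for it here. The adversarial term and the encoder/decoder between molecules and latent vectors are omitted.

```python
from typing import Callable, List

Vector = List[float]
Mapping = Callable[[Vector], Vector]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two latent vectors of equal length."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b)) / len(a)


def cycle_loss(G: Mapping, F: Mapping, xs: List[Vector], ys: List[Vector]) -> float:
    """Round-trip error: F(G(x)) should return to x, and G(F(y)) to y.

    xs and ys are unpaired; they may differ in size and no x is matched to a y.
    """
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


def identity_loss(G: Mapping, F: Mapping, xs: List[Vector], ys: List[Vector]) -> float:
    """Identity term as in the original CycleGAN (a stand-in, see the text above).

    G should leave vectors that are already in Y unchanged, and F likewise for X.
    """
    keep_y = sum(l1_distance(G(y), y) for y in ys) / len(ys)
    keep_x = sum(l1_distance(F(x), x) for x in xs) / len(xs)
    return keep_y + keep_x


# Toy stand-ins for trained networks (hypothetical): shift every coordinate by 1.
def G(x: Vector) -> Vector:
    return [xi + 1.0 for xi in x]


def F(y: Vector) -> Vector:
    return [yi - 1.0 for yi in y]


# Two unpaired sets of latent vectors, deliberately of different sizes.
xs = [[0.5, 0.25], [1.0, 2.0], [0.0, 4.0]]
ys = [[1.5, 3.0], [2.0, 0.5]]

cyc = cycle_loss(G, F, xs, ys)
idt = identity_loss(G, F, xs, ys)
print(f"cycle loss: {cyc}, identity loss: {idt}")
```

Because the toy `F` exactly inverts `G`, the round trips are perfect and the cycle term is zero, while the identity term is non-zero because both mappings move every vector. In a trained model the two terms would be weighted and combined with the adversarial loss; those weights are not given in the excerpt.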