2022
DOI: 10.1186/s13321-022-00599-3

Transformer-based molecular optimization beyond matched molecular pairs

Abstract: Molecular optimization aims to improve the drug profile of a starting molecule. It is a fundamental problem in drug discovery but challenging due to (i) the requirement of simultaneous optimization of multiple properties and (ii) the large chemical space to explore. Recently, deep learning methods have been proposed to solve this task by mimicking the chemist’s intuition in terms of matched molecular pairs (MMPs). Although MMPs is a widely used strategy by medicinal chemists, it offers limited capability in te…

Cited by 39 publications (67 citation statements)
References 32 publications
“…It is known that transformer models can learn general rules of composition of organic molecules without explicitly using information on their stability; here, we extend such an approach to generating biologically active molecules. As examples of alternatives, we can mention a reference where a separate LSTM model was trained to prioritize SMILES coming from a generative model, and references where transformer models were trained to optimize computable properties (logD, solubility, clearance, or logP), but not biological activity of generated molecules against a given protein target. Note also that transfer learning or fine-tuning of a generative model on molecules active against the given target is not necessary, as we show in this work.…”
Section: Discussion
Type: mentioning
confidence: 99%
“…Machine learning (ML) has been actively used for drug design (for recent reviews, see refs). However, applications of ML specifically to hit expansion have shown limited demonstrated success. A question arises whether this practice could be improved, and if so, what kind of ML models could better serve this purpose.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Following the data preparation method of He et al. [16], the CHEMBL 24 dataset was first standardized using MolVS [25] and subsequently cleaned by implementing operations such as removing complexes, removing hydrogen atoms, disconnecting metals, running reionizer acid, applying normalization rules and keeping stereo structures. Cumming et al. [26] summarized the filters used to find high quality compounds, such as "rule of five", AZFilters, quantitative estimate of drug-likeness (QED), etc.…”
Section: MMP Data Collection
Type: mentioning
confidence: 99%
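
The "rule of five" filter named in the statement above could be sketched roughly as follows. This is only an illustration, assuming the descriptors (molecular weight, logP, hydrogen-bond donors and acceptors) have already been computed by some cheminformatics toolkit. The dictionary keys, the helper names `count_ro5_violations` and `passes_rule_of_five`, the example compounds, and the choice to tolerate one violation are all hypothetical and are not taken from the cited work.

```python
from typing import Dict

# Upper bounds from Lipinski's rule of five.
# The key names are hypothetical; any descriptor source could be mapped onto them.
RO5_LIMITS: Dict[str, float] = {
    "mol_weight": 500.0,  # molecular weight in Da
    "logp": 5.0,          # octanol-water partition coefficient
    "h_donors": 5,        # hydrogen-bond donors
    "h_acceptors": 10,    # hydrogen-bond acceptors
}


def count_ro5_violations(descriptors: Dict[str, float]) -> int:
    """Count how many rule-of-five thresholds a compound exceeds."""
    return sum(1 for key, limit in RO5_LIMITS.items() if descriptors[key] > limit)


def passes_rule_of_five(descriptors: Dict[str, float], max_violations: int = 1) -> bool:
    """Keep a compound if it exceeds at most `max_violations` thresholds.

    Allowing one violation is an assumption made for this sketch.
    """
    return count_ro5_violations(descriptors) <= max_violations


# Made-up descriptor values, for illustration only.
compounds = {
    "small_polar": {"mol_weight": 180.2, "logp": 1.2, "h_donors": 1, "h_acceptors": 4},
    "large_greasy": {"mol_weight": 720.9, "logp": 6.8, "h_donors": 2, "h_acceptors": 9},
}

kept = [name for name, desc in compounds.items() if passes_rule_of_five(desc)]
print(kept)
```

In this sketch the second compound exceeds both the molecular-weight and logP bounds, so it is dropped while the first is kept.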
“…[15] The major difference with MMPA is that by adding property constraints to the input of the translation model, targeted modifications to the starting molecule, such as lowering the log D or increasing the clearance rate, can be achieved. [16][17][18] However, relative constraints (optimized properties - original properties) in the previous method may increase the learning difficulty of the model and reduce its optimization performance. Based on the above observations, we believe that the development of deep learning-based molecular optimization tools needs to address two aspects: (1) the development of higher accuracy ADMET prediction models; (2) the development of condition-generation models that satisfy multiple constraints simultaneously.…”
Section: Introduction
Type: mentioning
confidence: 99%
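
The idea of a relative constraint (optimized property minus original property) attached to the input of a translation model might look like the sketch below. It is an assumption-laden illustration, not the encoding used by any of the cited works: the bin edges, the token format `<logD_delta_binN>`, and the helper names are hypothetical, and the starting molecule is assumed to be tokenized already.

```python
from bisect import bisect_right
from typing import List

# Hypothetical bin edges for a change in logD; real models would choose their own.
LOGD_DELTA_EDGES: List[float] = [-1.0, -0.5, 0.0, 0.5, 1.0]


def relative_constraint(original: float, optimized: float) -> float:
    """Relative constraint as described in the statement: optimized - original."""
    return optimized - original


def delta_token(name: str, delta: float, edges: List[float]) -> str:
    """Map a property change onto a discrete token by binning it."""
    bin_index = bisect_right(edges, delta)  # 0 .. len(edges)
    return f"<{name}_delta_bin{bin_index}>"


def build_source_sequence(constraint_tokens: List[str], smiles_tokens: List[str]) -> List[str]:
    """Prepend the constraint tokens to the tokenized starting molecule."""
    return constraint_tokens + smiles_tokens


# Example: ask for a logD reduction from 3.2 to 2.4 on a toy molecule (ethanol).
delta = relative_constraint(original=3.2, optimized=2.4)
token = delta_token("logD", delta, LOGD_DELTA_EDGES)
source = build_source_sequence([token], ["C", "C", "O"])
print(source)
```

Because the token encodes only the difference between the two values, the same token is produced for any starting logD, which is one way to read the statement's concern that relative constraints can make the learning task harder.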
“…[108,109] MMPs are widely used in drug discovery and medicinal chemistry as they facilitate fast and easy understanding of structure-activity relationships. [110][111][112] Counterfactual and MMP analysis intersect if the structural change is associated with a drastic change in the properties. These MMPs are then counterfactual pairs.…”
Section: Similarity To Adjacent Fields
Type: mentioning
confidence: 99%