2021
DOI: 10.33774/chemrxiv-2021-v2pnn
Preprint

Chemformer: A Pre-Trained Transformer for Computational Chemistry

Abstract: Transformer models coupled with the Simplified Molecular-Input Line-Entry System (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present the Chemformer model, a Transformer-based model which can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-tr…
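To make the pairing of SMILES and sequence models concrete, here is a minimal sketch of how a SMILES string might be split into tokens before being fed to a sequence-to-sequence Transformer. It is an illustration only: the regular expression is a commonly used atom-level pattern, not necessarily the tokenizer used by Chemformer, and the names `SMILES_TOKEN_PATTERN` and `tokenize_smiles` are hypothetical.

```python
import re

# Assumption: an atom-level SMILES pattern of the kind widely used in
# reaction-prediction work; it is NOT taken from the Chemformer source.
# Bracket atoms such as [C@H] or [NH4+] are kept as single tokens, and
# two-letter halogens (Br, Cl) are matched before the one-letter B and C.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens.

    Raises ValueError if any character is not covered by the pattern,
    so that unsupported input is not silently truncated.
    """
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES string: {smiles!r}")
    return tokens


if __name__ == "__main__":
    # L-alanine written with a stereocentre as a bracket atom.
    print(tokenize_smiles("C[C@H](N)C(=O)O"))
```

The resulting token list would then be mapped to integer ids and padded like any other text sequence. The round-trip check in `tokenize_smiles` is a design choice for this sketch: it rejects strings containing characters the pattern does not recognise instead of dropping them.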

Cited by 2 publications (5 citation statements)
References 29 publications
“…Transformers extend this concept by using an ‘attention‐only’ framework that eliminates the need for RNN in sequence‐based tasks [ 62 ]. Recently, this newer approach has also been investigated for molecular optimisation and reaction prediction [ 63 ].…”
Section: The Requirements Of a Fragment Library
Citation type: mentioning
confidence: 99%
“…Physical chemical properties, such as hydrophobicity (logP), solvent accessible area (SAS), chemical shifts, and scalar couplings, are common predictive tasks for small molecule focused deep learning applications. [34][35][36][37][38] The popularity of these tasks is aided by their many uses in drug discovery, particularly in the lead optimization stage where molecules must be optimized to simultaneously satisfy a number of criteria, such as solubility and bioavailability. Deep learning has also been applied to the prediction of NMR chemical shifts and scalar coupling constants, [39][40][41] collisional cross sections (CSS) in mass spectrometry, [42] and protein properties such as isoelectric point and fluorescence extinction coefficient.…”
Section: Property Prediction
Citation type: mentioning
confidence: 99%
“…The use of deep learning for the "generation" of molecules is one of the more high profile examples, with applications in lead optimization, protein structure prediction, and protein design. For small molecules, popular applications include the prediction of retrosynthetic pathways [38,[48][49][50] and the generation of novel molecules, either from a starting molecule, such as a lead compound, or de novo. [34,51,52] Finally, machine learned potentials have demonstrated that deep learning models can be successfully applied to atomistic simulations.…”
Section: Molecular Generation
Citation type: mentioning
confidence: 99%