2023
DOI: 10.3390/ijms241511948

Molecular Descriptors Property Prediction Using Transformer-Based Approach

Abstract: In this study, we introduce semi-supervised machine learning models designed to predict molecular properties. Our model employs a two-stage approach, involving pre-training and fine-tuning. Particularly, our model leverages a substantial amount of labeled and unlabeled data consisting of SMILES strings, a text representation system for molecules. During the pre-training stage, our model capitalizes on the Masked Language Model, which is widely used in natural language processing, for learning molecular chemica…
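To make the masked-language-model idea in the abstract concrete, the following is a minimal sketch of how a SMILES string might be split into tokens and randomly masked for pre-training. It is not the authors' implementation: the regular expression, the `[MASK]` placeholder, the 15% masking rate (borrowed from common BERT practice), and the function names `tokenize` and `mask_tokens` are all illustrative assumptions.

```python
import random
import re

# Simplified SMILES token pattern (an assumption for illustration; production
# tokenizers handle more cases). Bracket atoms and two-letter halogens are
# matched before single characters.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+\\/().:@~*$]")

MASK = "[MASK]"  # hypothetical placeholder token


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Replace each token with MASK with probability mask_prob.

    Returns (masked_tokens, labels). labels holds the original token at
    masked positions and None elsewhere, so a model would be trained to
    predict only the masked positions.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


if __name__ == "__main__":
    aspirin = tokenize("CC(=O)Oc1ccccc1C(=O)O")
    masked, labels = mask_tokens(aspirin, rng=random.Random(0))
    print(masked)
    print(labels)
```

In a two-stage setup like the one described, a model would first be trained to recover the masked tokens from unlabeled SMILES, and the same encoder would then be fine-tuned on labeled property data.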

Citations: cited by 7 publications (1 citation statement)
References: 55 publications
“…SMILES is a widely used standard for representing molecular structures as strings of characters that can be easily input into a transformer-based model 65 69. Furthermore, using SMILES allows for greater flexibility and generalization of the input data because it can capture various molecular structures and properties 66, 67, 70. This makes the transformer-based models more robust and adaptable to new and diverse sets of molecules, which are critical for new drug discovery 71 73.…”
Section: Results (mentioning)
confidence: 99%