Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics 2019
DOI: 10.1145/3307339.3342186
SMILES-BERT

Cited by 249 publications (137 citation statements)
References 22 publications
“…There is a trend towards very large networks that (perhaps unexpectedly [135]) do not overtrain [55]. The biggest and most successful deep networks, presently GPT-3 [55], use transformer [136] architectures, including in drug discovery [137,138]. The largest flavour of GPT-3 has 96 layers with 12 299 nodes in each.…”
Section: Methods To Improve Generalisation (mentioning)
confidence: 99%
“…Following the great success of transformers in the computer vision and natural language processing domains, several transformer-based models have been proposed for efficient chemical representations. Leveraging the capability of the transformer as an encoder, such a model is usually pre-trained on massive sets of unlabeled chemical compounds, either in the form of SMILES or molecular graphs, which leads to outstanding performance in downstream tasks such as absorption, distribution, and toxicity prediction [101], [123], [138], [139], [140]. The crucial point of the chemical transformer is to fully exploit atom interactions and chemical structure information through self-attention mechanisms.…”
Section: Deep Learning Technologies: How Well Can We Accomplish the T... (mentioning)
confidence: 99%
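
To make the pre-training scheme described in the statement above concrete, the following is a minimal sketch in PyTorch of masked-token pre-training of a small transformer encoder on SMILES strings. The character-level vocabulary, model sizes, masking rate, and single training step are illustrative assumptions, not the settings used by SMILES-BERT or the other cited models.

# Hedged sketch: masked-token pre-training on SMILES with a small transformer
# encoder, in the spirit of SMILES-BERT-style models. Vocabulary, sizes, and
# masking rate are illustrative assumptions, not values from the cited papers.
import torch
import torch.nn as nn

PAD, MASK = 0, 1
chars = list("()[]=#+-.0123456789BCFHINOPSclnors")   # toy character vocabulary
stoi = {c: i + 2 for i, c in enumerate(chars)}
vocab_size = len(stoi) + 2

def encode(smiles, max_len=64):
    ids = [stoi.get(c, PAD) for c in smiles][:max_len]
    return ids + [PAD] * (max_len - len(ids))

class SmilesEncoder(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=4, max_len=64):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)   # predicts masked tokens

    def forward(self, x):
        pos = torch.arange(x.size(1), device=x.device).unsqueeze(0)
        h = self.encoder(self.tok(x) + self.pos(pos))
        return self.lm_head(h)

def mask_tokens(ids, p=0.15):
    """Replace a random fraction of non-pad tokens with MASK; return labels."""
    ids = ids.clone()
    maskable = ids != PAD
    chosen = maskable & (torch.rand(ids.shape) < p)
    if not chosen.any():                                # guarantee >= 1 target
        first = maskable.nonzero()[0]
        chosen[first[0], first[1]] = True
    labels = torch.full_like(ids, -100)                 # -100 is ignored by the loss
    labels[chosen] = ids[chosen]
    ids[chosen] = MASK
    return ids, labels

# One illustrative training step on two molecules; a real run would iterate
# over millions of unlabeled SMILES.
model = SmilesEncoder()
batch = torch.tensor([encode("CCO"), encode("c1ccccc1O")])
inp, labels = mask_tokens(batch)
logits = model(inp)
loss = nn.functional.cross_entropy(logits.view(-1, vocab_size),
                                   labels.view(-1), ignore_index=-100)
loss.backward()

The masking step here is deliberately simplified (every selected token is replaced by MASK rather than BERT's 80/10/10 split), and real models use richer SMILES tokenizers than single characters.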
“…[29,139].). Quickly, the basic idea of Word2Vec [140], i.e., learning generic embeddings from large corpora to facilitate downstream prediction tasks, was borrowed by Winter et al [141] and later by SMILES-BERT [142] and SMILES Transformer [143]. Without the notion of pretraining, related ideas on obtaining better SMILES embeddings were explored in SMILES2Vec [144] and SMILES-X [145].…”
Section: Molecular Representations (mentioning)
confidence: 99%
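
As a complement, here is a small hedged sketch of the pretrain-then-fine-tune pattern those papers borrow from Word2Vec and BERT: a (hypothetically) pre-trained SMILES encoder is reused, its token representations are pooled, and a task-specific head is trained for a downstream property. All class names, sizes, and the mean-pooling choice are assumptions for illustration, not details of the cited models.

# Hedged sketch: reuse of a pre-trained SMILES encoder for downstream property
# prediction. The encoder below is a stand-in whose weights would normally be
# loaded from pre-training; sizes and names are illustrative assumptions.
import torch
import torch.nn as nn

d_model, max_len, vocab_size = 128, 64, 40

class PretrainedSmilesEncoder(nn.Module):
    """Stand-in for an encoder pre-trained on unlabeled SMILES
    (positional encodings omitted for brevity)."""
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, 4, 4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, 2)

    def forward(self, ids):
        return self.encoder(self.tok(ids))         # (batch, seq, d_model)

class PropertyRegressor(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(d_model, 1)           # task-specific head

    def forward(self, ids):
        h = self.encoder(ids).mean(dim=1)           # mean-pool token embeddings
        return self.head(h).squeeze(-1)

encoder = PretrainedSmilesEncoder()                 # in practice: load weights
for p in encoder.parameters():                      # freeze; or fine-tune end to end
    p.requires_grad_(False)

model = PropertyRegressor(encoder)
ids = torch.randint(2, vocab_size, (8, max_len))    # placeholder token ids
targets = torch.randn(8)                            # placeholder property labels
loss = nn.functional.mse_loss(model(ids), targets)
loss.backward()                                     # here only the head is updated

Whether to freeze the encoder or fine-tune it end to end is a task-dependent design choice; the sketch freezes it only to keep the example short.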