2021
DOI: 10.1038/s41598-021-03431-4
Protein embeddings and deep learning predict binding residues for various ligand classes

Abstract: One important aspect of protein function is the binding of proteins to ligands, including small molecules, metal ions, and macromolecules such as DNA or RNA. Despite decades of experimental progress, many binding sites remain obscure. Here, we propose bindEmbed21, a method predicting whether a protein residue binds to metal ions, nucleic acids, or small molecules. The Artificial Intelligence (AI)-based method exclusively uses embeddings from the Transformer-based protein Language Model (pLM) ProtT5 as input. U…

Cited by 88 publications (102 citation statements). References 56 publications.
“…by using the output of the last hidden layers of the networks forming the pLMs, yields a representation of protein sequences referred to as embeddings (Figure 1 in ( 37 )). Embeddings have been used successfully as exclusive input to predicting secondary structure and subcellular localization at performance levels almost reaching ( 38–40 ) or even exceeding ( 37 , 46 , 47 ) the SOTA using evolutionary information from MSAs as input. Embeddings can even substitute sequence similarity for homology-based annotation transfer ( 48 , 49 ).…”
Section: Introduction (mentioning)
confidence: 99%
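The statement above describes embeddings as the per-residue output of a pLM's last hidden layer. A minimal numpy sketch of the resulting shapes (the 1024-dimensional embedding size matches ProtT5; the random matrix is only a stand-in for a real model's output):

```python
import numpy as np

# Hypothetical illustration of pLM embedding shapes; the random matrix
# stands in for the last hidden layer of a real model such as ProtT5.
rng = np.random.default_rng(0)
seq_len, d = 120, 1024  # 120-residue protein; d = 1024 matches ProtT5

# Per-residue embeddings: one d-dimensional vector per residue,
# as emitted by the pLM's last hidden layer.
residue_embeddings = rng.normal(size=(seq_len, d))

# Per-protein embedding for whole-protein tasks (e.g. subcellular
# localization): mean-pool over the residue dimension.
protein_embedding = residue_embeddings.mean(axis=0)

assert residue_embeddings.shape == (120, 1024)
assert protein_embedding.shape == (1024,)
```

Residue-level tasks such as binding-site prediction consume the full L×1024 matrix; protein-level tasks typically use the pooled vector.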
“…The bindEmbed21 approach combines homology-based inference with ML to predict whether a protein residue binds to a metal ion, a nucleic acid, or a small molecule [ 95 ]. The ML component used protein embeddings as inputs to a two-layer convolutional neural network (CNN).…”
Section: AI Methods Applied To Metalloproteins (mentioning)
confidence: 99%
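The statement above describes bindEmbed21's ML component as a two-layer CNN that maps per-residue embeddings to binding predictions. A minimal, untrained numpy sketch under assumed hyperparameters (kernel size 5 and 128 hidden channels are illustrative choices, not the paper's exact configuration), turning an L×1024 embedding matrix into per-residue probabilities for the three ligand classes (metal ion, nucleic acid, small molecule):

```python
import numpy as np

def conv1d(x, w, b):
    """'Same'-padded 1D convolution: x (L, c_in), w (k, c_in, c_out), b (c_out)."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.empty((x.shape[0], w.shape[2]))
    for i in range(x.shape[0]):
        # Window of k residues contracted against the filter bank.
        out[i] = np.tensordot(xp[i:i + k], w, axes=([0, 1], [0, 1])) + b
    return out

rng = np.random.default_rng(0)
L, d = 120, 1024
x = rng.normal(size=(L, d))  # per-residue ProtT5 embeddings (random stand-in)

# Two convolutional layers with untrained, randomly initialized weights.
w1 = rng.normal(scale=0.02, size=(5, d, 128)); b1 = np.zeros(128)
w2 = rng.normal(scale=0.02, size=(5, 128, 3)); b2 = np.zeros(3)

h = np.maximum(conv1d(x, w1, b1), 0.0)   # layer 1 + ReLU
logits = conv1d(h, w2, b2)               # layer 2: one output per ligand class
probs = 1.0 / (1.0 + np.exp(-logits))    # per-residue, per-class probabilities

assert probs.shape == (L, 3)
```

Because the weights are untrained, the probabilities are arbitrary; the sketch only shows the data flow from embeddings to per-residue, multi-class binding scores.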
“…predicting CATH (62) classes (25), LambdaPP will be updated to extend the breadth of pLM-based predictions offered. All feature prediction methods integrated into the LambdaPP webserver are currently based on ProtT5 embeddings (23) as methods trained on ProtT5 have, so far in our hands, outperformed those trained on ESM-1b (54) and others (4; 9; 24; 23) for numerous different applications (37-39; 42; 64; 13; 25; 72). This consistency also increases speed as the generation of embeddings has become a limiting step.…”
Section: Introduction (mentioning)
confidence: 91%
“…Protein language models (pLMs) are deep learning models pre-trained on large sets of unannotated sequences to generate numerical representations (embeddings) (9; 24; 40; 23; 49; 54). Embeddings from pLMs have been successfully used as input to downstream protein prediction tools (78; 37-39; 41; 42; 44; 64; 14; 25; 27; 63; 72). While some pLM-based methods do not reach the performance of MSA-based methods (23; 38; 72), others exceed those (5; 10; 23; 39; 42; 64; 28; 29; 36).…”
Section: Introduction (mentioning)
confidence: 99%