2022
DOI: 10.1038/s41598-022-21366-2
Improving protein succinylation sites prediction using embeddings from protein language model

Abstract: Protein succinylation is an important post-translational modification (PTM) responsible for many vital metabolic activities in cells, including cellular respiration, regulation, and repair. Here, we present a novel approach that combines features from supervised word embedding with embedding from a protein language model called ProtT5-XL-UniRef50 (hereafter termed, ProtT5) in a deep learning framework to predict protein succinylation sites. To our knowledge, this is one of the first attempts to employ embeddin…

Cited by 42 publications (38 citation statements)
References 43 publications (53 reference statements)
“…In that regard, amino acids (words) can be represented by dense vectors using word embeddings, where each vector is the projection of an amino acid into a continuous vector space. We used Keras’s embedding layer [ 30 ], as in LMSuccSite [ 27 ], to implement supervised word embedding, in which the embedding is learned as part of training a deep learning model. Parameter learning in this approach is supervised: the embedding parameters are updated, together with those of the subsequent layers, under the supervision of the labels.…”
Section: Methodsmentioning
confidence: 99%
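The idea described above can be illustrated with a minimal, self-contained sketch (not the authors' code, and NumPy rather than Keras): an embedding matrix is a trainable parameter, and the task loss propagates gradients back into the embedding rows, exactly as Keras's `Embedding` layer does when trained end-to-end. All names, dimensions, and the toy data here are illustrative assumptions.

```python
import numpy as np

# Sketch of a *supervised* word embedding: the embedding matrix E is a
# trainable parameter updated by the task loss, analogous to training a
# Keras Embedding layer jointly with a classifier head.

rng = np.random.default_rng(0)
VOCAB = 21          # 20 amino acids + a padding token (illustrative)
DIM = 8             # embedding dimension (illustrative)

E = rng.normal(0, 0.1, size=(VOCAB, DIM))   # embedding matrix (trainable)
w = rng.normal(0, 0.1, size=DIM)            # weights of a toy linear head
b = 0.0

def forward(tokens):
    """Look up embeddings, mean-pool the window, apply a sigmoid head."""
    X = E[tokens]                 # (L, DIM): one dense vector per residue
    pooled = X.mean(axis=0)       # simple pooling, for illustration only
    z = pooled @ w + b
    return X, pooled, 1.0 / (1.0 + np.exp(-z))

def train_step(tokens, y, lr=0.5):
    """One SGD step; the gradient flows back into the used embedding rows."""
    global b
    X, pooled, p = forward(tokens)
    dz = p - y                            # d(BCE)/dz for a sigmoid output
    w_grad = dz * pooled
    row_grad = dz * w / len(tokens)       # d(loss)/d(each pooled row)
    for t in tokens:                      # supervised update of embeddings
        E[t] -= lr * row_grad
    w[:] = w - lr * w_grad
    b -= lr * dz
    return p

# Toy data: residue-index windows with binary site labels.
data = [([1, 2, 3, 4], 1.0), ([5, 6, 7, 8], 0.0)]
for _ in range(200):
    for tokens, y in data:
        train_step(tokens, y)

probs = [forward(t)[2] for t, _ in data]
```

After training, the embedding rows for the two windows have been pushed apart purely by the label signal, which is the defining property of a supervised (as opposed to pretrained) embedding.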
“…Recently, these embeddings have been shown to be beneficial in various structural bioinformatics tasks, including but not limited to secondary structure prediction and subcellular location prediction. In that regard, in this work we use the pLM ProtT5 [ 21 , 27 ] as a static feature encoder to extract per-residue embeddings for the protein sequences in which we predict S-nitrosylation sites. It is relevant to note that the input to ProtT5 is the overall protein sequence.…”
Section: Methodsmentioning
confidence: 99%
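The static-encoder usage described above can be sketched as follows (a hedged illustration, not the authors' pipeline): the pLM is run once over the full sequence to produce a per-residue embedding matrix of shape (L, D), and a fixed-length block of rows centred on each candidate site is then sliced out as the site's feature, zero-padded past the sequence ends. The window size, dimensions, and random stand-in embeddings are assumptions for illustration.

```python
import numpy as np

def site_window(per_residue_emb: np.ndarray, site: int, half: int = 3) -> np.ndarray:
    """Return a (2*half+1, D) feature block centred on `site` (0-based),
    zero-padding positions that fall outside the protein sequence."""
    L, D = per_residue_emb.shape
    window = np.zeros((2 * half + 1, D))
    for offset in range(-half, half + 1):
        pos = site + offset
        if 0 <= pos < L:                      # inside the sequence
            window[offset + half] = per_residue_emb[pos]
    return window

# Stand-in for a static pLM's output (e.g. ProtT5 per-residue embeddings):
# random vectors for a 10-residue sequence, embedding dimension 4.
rng = np.random.default_rng(1)
emb = rng.normal(size=(10, 4))

# Near the N-terminus (site index 1), the first two rows are zero padding.
feat = site_window(emb, site=1, half=3)
```

Because the encoder is static, `emb` is computed once per protein and reused for every candidate site, which is what makes pLM features cheap to combine with a downstream supervised model.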