2022
DOI: 10.3389/fgene.2022.885929

ProtTrans-Glutar: Incorporating Features From Pre-trained Transformer-Based Models for Predicting Glutarylation Sites

Abstract: Lysine glutarylation is a post-translational modification (PTM) that plays a regulatory role in various physiological and biological processes. Identifying glutarylated peptides using proteomic techniques is expensive and time-consuming. Therefore, developing computational models and predictors can prove useful for rapid identification of glutarylation. In this study, we propose a model called ProtTrans-Glutar to classify a protein sequence into positive or negative glutarylation site by combining traditional …
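The abstract describes combining traditional sequence features with features from pre-trained transformer-based protein language models, and the citation excerpts below note that XGBoost served as the classifier. The following is a minimal illustrative sketch of that general approach, not the authors' published pipeline: the choice of ProtT5-XL-UniRef50 as the encoder, mean-pooling of per-residue embeddings, the window strings, and all hyperparameters are assumptions made here for demonstration only.

```python
# Minimal sketch (assumed details, not the published ProtTrans-Glutar pipeline):
# embed a lysine-centered peptide window with a pre-trained ProtTrans model,
# then classify glutarylation with XGBoost.
import re
import numpy as np
import torch
from transformers import T5Tokenizer, T5EncoderModel
from xgboost import XGBClassifier

# ProtT5-XL-UniRef50 is one of the ProtTrans encoders; used here as an assumption.
tokenizer = T5Tokenizer.from_pretrained("Rostlab/prot_t5_xl_uniref50", do_lower_case=False)
encoder = T5EncoderModel.from_pretrained("Rostlab/prot_t5_xl_uniref50").eval()

def embed(window: str) -> np.ndarray:
    """Mean-pool per-residue ProtT5 embeddings for one peptide window."""
    spaced = " ".join(re.sub(r"[UZOB]", "X", window))   # ProtT5 expects space-separated residues
    batch = tokenizer(spaced, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state[0]  # (len+1, 1024), includes trailing </s>
    return hidden[:-1].mean(dim=0).numpy()              # drop special token, pool over residues

# Toy lysine-centered windows and labels (1 = glutarylated, 0 = not); illustrative only.
windows = ["MKTAYIAKQRQISFVK", "GKDGNEKLTAAYKQAV"]
labels = [1, 0]
X = np.stack([embed(w) for w in windows])

clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
clf.fit(X, labels)
print(clf.predict_proba(X)[:, 1])                       # predicted glutarylation probabilities
```

In a fuller version of this idea, the embedding vector would be concatenated with the traditional sequence-derived features the abstract mentions before training the classifier.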

Cited by 9 publications (4 citation statements)
References 31 publications
“…Self-supervised training endows the latent variables of the model with highly informative features, known as learned representations, which can then be leveraged in downstream tasks where limited training data is available. Learned protein representations are currently central to the state-of-the-art tools for predicting variant fitness [3][4][5][6], protein function [7,8], subcellular localisation [9], solubility [10], binding sites [11], signal peptides [12], post-translational modifications [13], intrinsic disorder [14], and others [15,16], and they have shown promise in the path towards accurate alignment-free protein structure prediction [17][18][19][20][21]. Improving learned representations is therefore a potential path to deliver consistent, substantial improvements across computational protein engineering.…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, iGluK-Deep [14] was proposed based on deep neural networks and Chou's Pseudo Amino Acid Composition (PseAAC). ProtTrans-Glutar [15] incorporated the XGBoost and pre-trained features using the transformer. Details of these studies are summarized in Table 1.…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, iGluK-Deep [15] was proposed based on deep neural networks and Chou’s Pseudo Amino Acid Composition (PseAAC). ProtTrans-Glutar [16] incorporated the XGBoost and pre-trained features by Transformer. DeepDN_iGlu [17] was proposed by employing binary encoding as feature representation, using DenseNet as the classification model, and utilizing the focal loss function to address the imbalance issue.…”
Section: Introduction (mentioning)
confidence: 99%