2022
DOI: 10.1101/2022.05.20.492769
Preprint

Alignment-free metal ion-binding site prediction from protein sequence through pretrained language model and multi-task learning

Abstract: More than one-third of the proteins contain metal ions in the Protein Data Bank. Correct identification of metal ion-binding residues is important for understanding protein functions and designing novel drugs. Due to the small size and high versatility of metal ions, it remains challenging to computationally predict their binding sites from protein sequence. Existing sequence-based methods are of low accuracy due to the lack of structural information, and time-consuming owing to the usage of multi-sequence ali…

Cited by 7 publications (3 citation statements)
References: 53 publications
“…Most of the current prediction methods are data-driven, which hinge on the extraction of features from both amino acid sequences and structures to facilitate the training of predictive models [12]. Some approaches centred around sequences [13][14], and others exhibit a structural emphasis akin to methods like Fold-X [15], MIB [16][17], and BioMetAll [18]. Other methodologies take a dual approach, combining sequence-based and structural information to achieve a more comprehensive analytical standpoint [19][20].…”
Section: Introduction
mentioning
confidence: 99%
“…For example, transfer learning technique has shown ability to transfer knowledge from related tasks with large‐scale data to the target task with limited data [28]. Our previous works have highlighted the power of pretrained protein language model ProtTrans [29] on metal ion‐binding site prediction [30]…”
Section: Introduction
mentioning
confidence: 99%
“…[28] Our previous works have highlighted the power of pretrained protein language model ProtTrans [29] on metal ion-binding site prediction [30]. Another limitation of existing sequence-based solubility change prediction is the lack of protein structure information. Deep learning methods have been shown to benefit from protein structure information for protein solubility prediction [31], stability changes prediction [32], protein design [33], protein-protein interaction [34], and drug discovery.…”
mentioning
confidence: 99%