2023
DOI: 10.1093/bioinformatics/btad046

Phosformer: an explainable transformer model for protein kinase-specific phosphorylation predictions

Abstract: Motivation: The human genome encodes over 500 distinct protein kinases which regulate nearly all cellular processes by the specific phosphorylation of protein substrates. While advances in mass spectrometry and proteomics studies have identified thousands of phosphorylation sites across species, information on the specific kinases that phosphorylate these sites is currently lacking for the vast majority of phosphosites. Recently, there has been a major focus on the development of computational…

Cited by 12 publications (11 citation statements)
References 40 publications

“…Considering the ever increasing model complexity, XAI has started to gain traction in the field of protein analysis too (Upmeier zu Belzen et al 2019, Taujale et al 2021, Vig et al 2021, Hou et al 2023, Vu et al 2023, Zhou et al 2023), but quantitative evidence for its applicability beyond single examples was lacking up to now. We provide statistical evidence for the alignment of attribution maps with corresponding sequence annotations, both on the embedding level as well as for specific heads inside of the model architecture, which led to the identification of specialized heads for specific protein function prediction tasks.…”
Section: Discussion (mentioning)
confidence: 99%
“…Explainability methods have been employed in NLP too (Arras et al 2019, Manning et al 2020, Chefer et al 2021, Pascual et al 2021). Moreover, researchers have started to explore using explainability methods in the area of protein function prediction (Upmeier zu Belzen et al 2019, Taujale et al 2021, Vig et al 2021, Hou et al 2023, Vu et al 2023, Zhou et al 2023).…”
Section: Introduction (mentioning)
confidence: 99%
“…Furthermore, applying results in the developing field of Explainable AI (XAI) and Machine Learning (XML) (46,47), to BioLMTox may lead to more biologically interpretable results, specifically in computationally determining mechanisms of protein toxicity. These methods may improve toxin domain and motif characterization and allow comparing model explanations that can be validated with experimental observations.…”
Section: Computational Efficiency And (mentioning)
confidence: 99%
“…Because of the transient nature of kinase–substrate interactions and the cost and time associated with the experimental characterization of kinase substrates, there has been considerable effort in the development of machine learning models for kinase–substrate prediction (Wang et al 2017, Luo et al 2019, Yang et al 2021, Kirchoff and Gomez 2022, Zhou et al 2023). In general, these models are trained on known phosphosites in databases such as PhosphoSitePlus (Hornbeck et al 2012), then used to predict new kinase–substrate associations.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most often, these models lack negative data (Hornbeck et al 2012), a crucial element for reducing the incidence of false-positive predictions. The second challenge stems from the model’s design and training objectives, with many models confined to making predictions on a restricted set of well-studied kinases (Wang et al 2017, Luo et al 2019, Yang et al 2021, Kirchoff and Gomez 2022, Zhou et al 2023), limiting their application in discovering novel kinase–substrate interactions for understudied kinases. Therefore, creating a kinase–substrate phosphorylation predictor that assimilates both experimentally validated positive and negative data into one framework with the capability of generalizing beyond the seen kinase–substrate pairs could significantly augment the model’s pattern recognition ability and enhance its capacity for zero-shot predictions on new kinases.…”
Section: Introduction (mentioning)
confidence: 99%