DeepRMethylSite: a deep learning based approach for prediction of arginine methylation sites in proteins

Chaudhari, Meenal; Thapa, Niraj; Newman, Robert H.; Saigo, Hiroto; Dukka, B Kc

doi:10.1039/d0mo00025f

Cited by 26 publications

(28 citation statements)

References 29 publications


“…For example, Alanine (A) is represented as 100000000000000000000, Arginine (R) is represented as 010000000000000000000, and so on. However, PTM classification models such as DeepSuccinylSite [12], DL-Malosite [13], and DeepRMethylSite [14] implemented an embedded encoding scheme [15] with better performance metrics than one-hot encoding. In this study, we used an embedding layer for the encoding of protein sequences.…”

Section: Methods (mentioning)

confidence: 99%
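The difference between the one-hot and embedding encodings described in the statement above can be illustrated with a minimal sketch. It is not taken from any of the cited tools: the 21-symbol alphabet (20 residues plus a padding symbol), the residue ordering beyond A and R, the embedding size, and the example window are all assumptions made for illustration. In a real model the embedding table would be learned during training rather than drawn at random.

```python
import random

# Assumed alphabet: 20 standard residues plus "-" for padding (21 symbols).
# Only "A first, R second" comes from the quoted example; the rest is an assumption.
ALPHABET = "ARNDCQEGHILKMFPSTWYV-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot(window):
    """Static encoding: each residue becomes a fixed 21-dimensional 0/1 vector."""
    return [[1 if i == INDEX[aa] else 0 for i in range(len(ALPHABET))]
            for aa in window]


def to_indices(window):
    """Integer encoding, the usual input format for a trainable embedding layer."""
    return [INDEX[aa] for aa in window]


def embed(indices, table):
    """Embedding lookup: each index selects one row of a weight table."""
    return [table[i] for i in indices]


EMBED_DIM = 4  # hypothetical embedding size
rng = random.Random(0)
# Stand-in for learned weights: one small random vector per symbol.
table = [[rng.uniform(-0.05, 0.05) for _ in range(EMBED_DIM)] for _ in ALPHABET]

window = "GARKR"  # hypothetical sequence window centred on a candidate site
print("".join(map(str, one_hot("A")[0])))  # 100000000000000000000
print("".join(map(str, one_hot("R")[0])))  # 010000000000000000000
print(to_indices(window))                  # [7, 0, 1, 11, 1]
print(len(embed(to_indices(window), table)), "vectors of size", EMBED_DIM)
```

The one-hot vectors are fixed and sparse, whereas the rows of the embedding table are dense and can be adjusted by training, which is the property the citing authors credit for the improved performance.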


A deep learning based approach for prediction of Chlamydomonas reinhardtii phosphorylation sites

Thapa, Chaudhari, Iannetta, et al. 2021. Sci Rep. Self-citation.

Protein phosphorylation, which is one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas reinhardtii, a model algal phototroph. An ensemble model combining convolutional neural networks and long short-term memory (LSTM) achieves the best performance in predicting phosphorylation sites in C. reinhardtii. The model, termed Chlamy-EnPhosSite, achieves a best AUC of 0.90 and MCC of 0.64 on a combined dataset of serine (S) and threonine (T) sites in independent testing, higher than the corresponding measures for other predictors. When applied to the entire C. reinhardtii proteome (totaling 1,809,304 S and T sites), Chlamy-EnPhosSite yielded 499,411 phosphorylated sites with a cut-off value of 0.5 and 237,949 phosphorylated sites with a cut-off value of 0.7. These predictions were compared to an experimental dataset of phosphosites identified by liquid chromatography-tandem mass spectrometry (LC–MS/MS) in a blinded study: approximately 89.69% of 2,663 C. reinhardtii S and T phosphorylation sites were successfully predicted by Chlamy-EnPhosSite at a probability cut-off of 0.5, and 76.83% at a more stringent cut-off of 0.7. Interestingly, Chlamy-EnPhosSite also successfully predicted experimentally confirmed phosphorylation sites in a protein sequence (e.g., RPS6 S245) that did not appear in the training dataset, highlighting prediction accuracy and the power of leveraging predictions to identify biologically relevant PTM sites. These results demonstrate that our method represents a robust and complementary technique for high-throughput phosphorylation site prediction in C. reinhardtii. It has the potential to serve as a useful tool to the community. Chlamy-EnPhosSite will contribute to the understanding of how protein phosphorylation influences various biological processes in this important model microalga.


“…Both MusiteDeep and DeepPhos employ binary encoding, which is static in nature. Our previous DL-based predictors for succinylation [12], malonylation [13], and methylation [14] instead utilize embedding [15] for encoding, demonstrating significantly improved model performance compared to binary encoding.…”

Section: Introduction (mentioning)

confidence: 99%


“…Any redundant sequences within and between the positive and negative sites were removed to obtain a non-redundant set. Similar to our previous studies (Chaudhari et al., 2020; Thapa et al., 2020), we used an under-sampling strategy to balance the dataset, which had more negative sites than positive sites prior to balancing (Lemaître et al., 2017). Under-sampling randomly selects negative sequences so that the number of negative sites equals the number of positive sites, thus balancing the dataset.…”

Section: Methods (mentioning)

confidence: 99%
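The under-sampling step described in the statement above can be sketched in a few lines. This is a standard-library illustration under stated assumptions, not the authors' code: the function name `undersample`, the fixed seed, and the example windows are hypothetical, and the cited work may have used a library routine instead.

```python
import random


def undersample(positives, negatives, seed=0):
    """Randomly keep as many negatives as there are positives.

    If negatives are not the majority class, both lists are returned unchanged.
    """
    if len(negatives) <= len(positives):
        return list(positives), list(negatives)
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    kept = rng.sample(negatives, k=len(positives))
    return list(positives), kept


# Hypothetical sequence windows; real data would be windows around candidate sites.
positives = ["GARKR", "SSRGG", "PLRGA"]
negatives = ["AARTT", "KKRLL", "MMRQQ", "DDREE", "VVRII", "FFRWW", "CCRHH"]

pos, neg = undersample(positives, negatives)
print(len(pos), len(neg))  # 3 3
```

Random under-sampling discards information from the majority class, so the selection seed is worth recording when results need to be reproduced.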

DTL-DephosSite: Deep Transfer Learning Based Approach to Predict Dephosphorylation Sites

Chaudhari, Thapa, Ismail, et al. 2021. Front. Cell Dev. Biol. Self-citation.

Phosphorylation, which is mediated by protein kinases and opposed by protein phosphatases, is an important post-translational modification that regulates many cellular processes, including cellular metabolism, cell migration, and cell division. Due to its essential role in cellular physiology, a great deal of attention has been devoted to identifying sites of phosphorylation on cellular proteins and understanding how modification of these sites affects their cellular functions. This has led to the development of several computational methods designed to predict sites of phosphorylation based on a protein’s primary amino acid sequence. In contrast, much less attention has been paid to dephosphorylation and its role in regulating the phosphorylation status of proteins inside cells. Indeed, to date, dephosphorylation site prediction tools have been restricted to a few tyrosine phosphatases. To fill this knowledge gap, we have employed a transfer learning strategy to develop a deep learning-based model to predict sites that are likely to be dephosphorylated. Based on independent test results, our model, which we termed DTL-DephosSite, achieved a sensitivity (SN), specificity (SP), and Matthews correlation coefficient (MCC) of 84%, 84%, and 0.68, respectively, for phosphoserine/phosphothreonine residues. Similarly, DTL-DephosSite achieved an SN, SP, and MCC of 75%, 88%, and 0.64, respectively, for phosphotyrosine residues.


“…In recent years, deep learning (DL) based methods have been used to predict the PTM sites in cellular proteins. Typical applications include DeepSuccinylSite [54], MusiteDeep [55], DeepRMethylSite [56], and DeepPhos [57]. In DL, a suitable raw vector is given to the architecture and transformed into highly abstract features by propagating through the whole model.…”

Section: Introduction (mentioning)

confidence: 99%

RecSNO: Prediction of Protein S-Nitrosylation Sites Using a Recurrent Neural Network

et al. 2021


DeepRMethylSite: a deep learning based approach for prediction of arginine methylation sites in proteins

Abstract: DeepRMethylSite is an ensemble-based deep learning model that takes protein sequences as input and predicts sites of Arginine methylation. The implementation and source code are provided at https://github.com/dukkakc/DeepRMethylSite.
