Predicting phosphorylation sites using machine learning by integrating the sequence, structure, and functional information of proteins

Jamal, Salma; Ali, Waseem; Nagpal, Priya; Grover, Abhinav; Grover, Sonam

doi:10.1186/s12967-021-02851-0

Cited by 23 publications (17 citation statements)

References 56 publications (74 reference statements)


“…Among classification machine learning methods that included SVM, DT and RF, RF had the best performance in classifying subjects on their MetS outcomes as indicated by the highest accuracy (0.743) as well as area under the receiver operating characteristic curve (AU-ROC) (0.804) and AUC-PR (0.776). This result is similar to the findings in the study by Szabo et al that applied the Random Forest algorithm for a similar task and calculated the accuracy of this method to be 71.4% [46–48]. Worachartcheewan et al also implemented a Random Forest model to predict MetS in the Bangkok population and identify the most influential predictors.…”

Section: Discussion (supporting)

confidence: 90%

Evaluating machine learning-powered classification algorithms which utilize variants in the GCKR gene to predict metabolic syndrome: Tehran Cardio-metabolic Genetics Study

et al. 2022


Background: Metabolic syndrome (MetS) is a prevalent multifactorial disorder that can increase the risk of developing diabetes, cardiovascular diseases, and cancer. We aimed to compare different machine learning classification methods in predicting metabolic syndrome status as well as identifying influential genetic or environmental risk factors.

Methods: This candidate gene study was conducted on 4756 eligible participants from the Tehran Cardio-metabolic Genetic study (TCGS). We compared predictive models using logistic regression (LR), Random Forest (RF), decision tree (DT), support vector machines (SVM), and discriminant analyses. Demographic and clinical features, as well as variables regarding common GCKR gene polymorphisms, were included in the models. We used a 10-repeated tenfold cross-validation to evaluate model performance.

Results: 50.6% of participants had MetS. MetS was significantly associated with age, gender, schooling years, BMI, physical activity, rs780094, and rs780093 (P < 0.05) as indicated by LR. RF showed the best performance overall (AUC-ROC = 0.804, AUC-PR = 0.776, and Accuracy = 0.743) and indicated BMI, physical activity, and age to be the most influential model features. According to the DT, a person with BMI < 24 and physical activity < 8.8 possesses a 4% chance for MetS. In contrast, a person with BMI ≥ 25, physical activity < 2.7, and age ≥ 33, has 77% probability of suffering from MetS.

Conclusion: Our findings indicated that, on average, machine learning models outperformed conventional statistical approaches for patient classification. These well-performing models may be used to develop future support systems that use a variety of data sources to identify persons at high risk of getting MetS.


“…To identify protein kinase-substrate relationships within the clusters (Jamal et al., 2021), we performed substrate motif analysis for each temporal cluster of DEpP (Figure S17). In the P. increase and decrease clusters, the recognition motifs for phosphorylation by PKA, PKC, casein kinase I, GSK3 and Ca2+/calmodulin-dependent protein kinase 2 (CAMK2) family members were noted.…”

Section: Results (mentioning)

confidence: 99%

Shared and unique phosphoproteomics responses in skeletal muscle from exercise models and in hyperammonemic myotubes

et al. 2022


“…The phosphorylation is a type of posttranslational modification of proteins that regulates various aspects of their functionalities [60,61]. Protein phosphorylation plays a key role in cell signaling, gene expression, and differentiation [62,63].…”

Section: Introduction (mentioning)

confidence: 99%

Comparative Phosphoproteomics of Neuro-2a Cells under Insulin Resistance Reveals New Molecular Signatures of Alzheimer’s Disease

Kim et al. 2022, IJMS


Insulin in the brain is a well-known critical factor in neuro-development and regulation of adult neurogenesis in the hippocampus. The abnormality of brain insulin signaling is associated with the aging process and altered brain plasticity, and could promote neurodegeneration in the late stage of Alzheimer’s disease (AD). The precise molecular mechanism of the relationship between insulin resistance and AD remains unclear. The development of phosphoproteomics has advanced our knowledge of phosphorylation-mediated signaling networks and could elucidate the molecular mechanisms of certain pathological conditions. Here, we applied a reliable phosphoproteomic approach to Neuro2a (N2a) cells to identify their molecular features under two different insulin-resistant conditions with clinical relevance: inflammation and dyslipidemia. Despite significant difference in overall phosphoproteome profiles, we found molecular signatures and biological pathways in common between two insulin-resistant conditions. These include the integrin and adenosine monophosphate-activated protein kinase pathways, and we further verified these molecular targets by subsequent biochemical analysis. Among them, the phosphorylation levels of acetyl-CoA carboxylase and Src were reduced in the brain from rodent AD model 5xFAD mice. This study provides new molecular signatures for insulin resistance in N2a cells and possible links between the molecular features of insulin resistance and AD.
