APLpred: A machine learning-based tool for accurate prediction and characterization of asparagine peptide lyases using sequence-derived optimal features

Malik, Adeel; Kamli, Majid Rasool; Sabir, Jamal S.M.; Rather, Irfan A.; Phan, Le Thi; Kim, Chang-Bae; Manavalan, Balachandran

doi:10.1016/j.ymeth.2024.05.014

Methods

2024


Cited by 1 publication

References 53 publications

GPpred: A Novel Sequence-Based Tool for Predicting Glutamic Proteases Using Optimized Hybrid Encodings

Firoz, Malik, Mahajan et al. 2024

Catalysts


Glutamic proteases (GPs) represent one of the seven peptidase families described in the MEROPS database of peptidases (also known as proteases, proteinases, and proteolytic enzymes). Currently, the GP family is divided into six sub-families (G1–G6) distributed across three clans (GA, GB, and GC). The catalytic residues in this family are a glutamic acid and a second, variable amino acid. Members of the GP family are involved in a wide variety of biological functions. For example, they are associated with bacterial and plant pathogens and are implicated in cancer and celiac disease. Given their crucial roles in numerous biological processes, these enzymes are considered potential drug targets. Characterizing GPs provides insights into their structure–function relationships, enabling the design of specific inhibitors or modulators. Such advances contribute directly to drug discovery by identifying novel therapeutic targets and guiding the development of potent and selective drugs for various diseases, including cancers and autoimmune disorders. To address the challenges associated with labor-intensive experimental methods, we developed GPpred, a support vector machine (SVM)-based predictor that identifies GPs from their primary sequences. The workflow systematically extracts six distinct feature sets from primary sequences and then applies a recursive feature elimination (RFE) algorithm to identify the most informative hybrid encodings. These optimized encodings were used to evaluate multiple machine learning classifiers, including K-Nearest Neighbors (KNN), Random Forest (RF), Naïve Bayes (NB), and SVM. Among these, the SVM performed consistently, with an accuracy of 97% in both cross-validation and independent validation.
Computational methods like GPpred accelerate drug discovery by analyzing large datasets, predicting potential enzyme targets, and prioritizing candidates for experimental validation, thereby significantly reducing time and costs. GPpred should therefore be a valuable tool for discovering GPs in large datasets and for narrowing down viable therapeutic candidates.
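
The abstract does not name the six feature sets, so the following is only a minimal sketch of what "extracting feature sets from primary sequences" and combining them into a hybrid encoding can look like. It assumes two common sequence descriptors, amino acid composition and dipeptide composition, chosen purely for illustration; the function names are hypothetical and are not taken from GPpred. In a workflow like the one described, vectors of this kind would then be passed to a feature-selection step such as RFE before training a classifier.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each ordered residue pair ("AA", "AC", ...) to a position in 0..399.
DIPEPTIDE_INDEX = {
    a + b: k for k, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))
}


def amino_acid_composition(seq: str) -> list[float]:
    """Return the fraction of each standard residue (20 values).

    Non-standard characters are ignored. Illustrative descriptor only;
    not confirmed to be one of the feature sets used by GPpred.
    """
    valid = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not valid:
        return [0.0] * len(AMINO_ACIDS)
    return [valid.count(aa) / len(valid) for aa in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> list[float]:
    """Return the fraction of each ordered adjacent residue pair (400 values).

    Pairs containing a non-standard character are skipped.
    """
    seq = seq.upper()
    counts = [0] * len(DIPEPTIDE_INDEX)
    total = 0
    for i in range(len(seq) - 1):
        k = DIPEPTIDE_INDEX.get(seq[i:i + 2])
        if k is not None:
            counts[k] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDE_INDEX)
    return [c / total for c in counts]


def hybrid_encoding(seq: str) -> list[float]:
    """Concatenate both descriptors into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    vector = hybrid_encoding("ACAC")
    print(len(vector))
```

Concatenating descriptors keeps each feature at a fixed index, which is what lets a later selection step rank and discard individual features across all sequences.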

