Background: Druggable proteins are a trending topic in drug design. The druggable proteome can be defined as the percentage of proteins that have the capacity to bind an antibody or small molecule with adequate chemical properties and affinity. Screening and in silico modeling are critical for reducing experimental costs.

Methods: The current work proposes a unique prediction model for druggable proteins using amino acid composition descriptors of protein sequences and 13 linear and non-linear machine learning classifiers. After feature selection, the best classifier was obtained using the support vector machine method and 200 tri-amino acid composition descriptors.

Results: The model achieved an area under the receiver operating characteristic curve (AUROC) of 0.975 ± 0.003 and an accuracy of 0.929 ± 0.006 (3-fold cross-validation). When the model was applied to cancer-associated proteins, the best-ranked predicted druggable proteins in the breast cancer protein set were CDK4, AP1S1, POLE, HMMR, RPL5, PALB2, TIMP1, RPL22, NFKB1 and TOP2A; in the cancer-driving protein set, TLL2, FAM47C, SAGE1, HTR1E, MACC1, ZFR2, VMA21, DUSP9, CTNNA3 and GABRG1; and in the RNA-binding protein set, PLA2G1B, CPEB2, NOL6, LRRC47, CTTN, CORO1A, SCAF11, KCTD12, DDX43 and TMPO.

Conclusions: This model predicts several druggable proteins that merit in-depth study to find better therapeutic targets and thus improve clinical trials. The scripts are freely available at https://github.com/muntisa/machine-learning-for-druggable-proteins.
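To make the tri-amino acid composition descriptors named in the Methods more concrete, the following is a minimal Python sketch of how such a descriptor vector could be computed from a protein sequence. It is not taken from the authors' repository. The function name `tripeptide_composition`, the choice to normalize by the number of valid overlapping windows, and the skipping of non-standard residues are all assumptions made for illustration. The full vector has 20^3 = 8000 entries; reducing it to 200 descriptors by feature selection and training the support vector machine are not shown here.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 8000 possible tripeptides, in a fixed order, and their vector positions.
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
TRIPEPTIDE_INDEX = {t: i for i, t in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Return the tri-amino acid composition of a sequence as 8000 floats.

    Assumption (hypothetical convention, not confirmed by the source):
    each entry is the count of one tripeptide divided by the number of
    overlapping 3-residue windows made only of standard amino acids.
    """
    seq = sequence.upper()
    counts = [0] * len(TRIPEPTIDES)
    total = 0
    for i in range(len(seq) - 2):
        idx = TRIPEPTIDE_INDEX.get(seq[i:i + 3])
        if idx is not None:  # skip windows containing non-standard residues
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or without any valid window.
        return [0.0] * len(TRIPEPTIDES)
    return [c / total for c in counts]
```

A vector produced this way for each protein would form one row of the feature matrix passed to feature selection and then to a classifier.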