BackgroundDruggable proteins are a trending topic in drug design. The druggable proteome can be defined as the percentage of proteins that have the capacity to bind an antibody or small molecule with adequate chemical properties and affinity. The screening and in silico modeling are critical activities for the reduction of experimental costs.MethodsThe current work proposes a unique prediction model for druggable proteins using amino acid composition descriptors of protein sequences and 13 machine learning linear and non-linear classifiers. After feature selection, the best classifier was obtained using the support vector machine method and 200 tri-amino acid composition descriptors.ResultsThe high performance of the model is determined by an area under the receiver operating characteristics (AUROC) of 0.975 ± 0.003 and accuracy of 0.929 ± 0.006 (3-fold cross-validation). Regarding the prediction of cancer-associated proteins using this model, the best ranked druggable predicted proteins in the breast cancer protein set were CDK4, AP1S1, POLE, HMMR, RPL5, PALB2, TIMP1, RPL22, NFKB1 and TOP2A; in the cancer-driving protein set were TLL2, FAM47C, SAGE1, HTR1E, MACC1, ZFR2, VMA21, DUSP9, CTNNA3 and GABRG1; and in the RNA-binding protein set were PLA2G1B, CPEB2, NOL6, LRRC47, CTTN, CORO1A, SCAF11, KCTD12, DDX43 and TMPO.ConclusionsThis powerful model predicts several druggable proteins which should be deeply studied to find better therapeutic targets and thus improve clinical trials. The scripts are freely available at https://github.com/muntisa/machine-learning-for-druggable-proteins.