In this study, we proposed an ensemble learning method, simultaneously integrating a low-rank matrix completion model and a ridge regression model to predict anticancer drug response on cancer cell lines. The model was applied to two benchmark datasets, including the Cancer Cell Line Encyclopedia (CCLE) and the Genomics of Drug Sensitivity in Cancer (GDSC). As previous studies suggest, the dual-layer integrated cell line-drug network model was one of the best models by far and outperformed most state-of-the-art models. Thus, we performed a head-to-head comparison between the dual-layer integrated cell line-drug network model and our model by a 10-fold crossvalidation study. For the CCLE dataset, our model has a higher Pearson correlation coefficient between predicted and observed drug responses than that of the dual-layer integrated cell line-drug network model in 18 out of 23 drugs. For the GDSC dataset, our model is better in 26 out of 28 drugs in the phosphatidylinositol 3-kinase (PI3K) pathway and 26 out of 30 drugs in the extracellular signal-regulated kinase (ERK) signaling pathway, respectively. Based on the prediction results, we carried out two types of case studies, which further verified the effectiveness of the proposed model on the drug-response prediction. In addition, our model is more biologically interpretable than the compared method, since it explicitly outputs the genes involved in the prediction, which are enriched in functions, like transcription, Src homology 2/3 (SH2/3) domain, cell cycle, ATP binding, and zinc finger.
BackgroundAccurate prediction of anticancer drug responses in cell lines is a crucial step to accomplish the precision medicine in oncology. Although many popular computational models have been proposed towards this non-trivial issue, there is still room for improving the prediction performance by combining multiple types of genome-wide molecular data.ResultsWe first demonstrated an observation on the CCLE and GDSC datasets, i.e., genetically similar cell lines always exhibit higher response correlations to structurally related drugs. Based on this observation we built a cell line-drug complex network model, named CDCN model. It captures different contributions of all available cell line-drug responses through cell line similarities and drug similarities. We executed anticancer drug response prediction on CCLE and GDSC independently. The result is significantly superior to that of some existing studies. More importantly, our model could predict the response of new drug to new cell line with considerable performance. We also divided all possible cell lines into “sensitive” and “resistant” groups by their response values to a given drug, the prediction accuracy, sensitivity, specificity and goodness of fit are also very promising.ConclusionCDCN model is a comprehensive tool to predict anticancer drug responses. Compared with existing methods, it is able to provide more satisfactory prediction results with less computational consumption.Electronic supplementary materialThe online version of this article (10.1186/s12859-019-2608-9) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.