Drug repositioning, the discovery of new indications for existing drugs, is known to alleviate the bottleneck of drug discovery and development. To support drug repositioning, many in silico methods have been proposed for predicting drug-disease associations. A meta-path based approach, which extracts network-based information through paths from a drug to a disease, can achieve performance comparable to other approaches while requiring less information. However, existing meta-path based methods typically use only counts of the extracted paths and discard information about the intermediate nodes in those paths, such as drug- and disease-associated proteins, even though these nodes are very important indicators. Herein, we propose an ensemble learning method with Meta-path based Gene ontology Profiles for predicting Drug-Disease Associations (MGP-DDA). We exploit gene ontology (GO) terms to link drugs and diseases to their associated functions; the GO terms act as intermediate nodes in a drug-GO-disease tripartite network. For each drug-disease pair, MGP-DDA uses meta-paths to generate novel profiles of GO functions, termed meta-path based GO profiles. We train bagging and boosting classifiers on these novel features to distinguish known (positive) from unknown (unlabeled) drug-disease associations. MGP-DDA outperforms the state-of-the-art methods and achieves a precision of 88.6%. The substantial share of new drug-disease associations predicted by MGP-DDA that have supporting evidence in ClinicalTrials.gov (37.7%) supports the practicality of our method for drug repositioning.
INDEX TERMS: Drug-disease association, drug repositioning, ensemble learning, gene ontology profile, meta-path, tripartite network.
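The difference between counting paths and keeping the intermediate nodes can be illustrated with a minimal sketch. It is not the MGP-DDA implementation: the binary profile construction, the function names `go_profile` and `path_count`, and all drug, disease, and GO identifiers are hypothetical and chosen only for illustration.

```python
# Toy drug-GO-disease tripartite network.
# All identifiers are hypothetical placeholders, not real annotations.
drug_go = {
    "drugA": {"GO:1", "GO:2"},
    "drugB": {"GO:3"},
}
disease_go = {
    "diseaseX": {"GO:2", "GO:3"},
}
go_terms = ["GO:1", "GO:2", "GO:3"]  # fixed ordering of profile dimensions


def go_profile(drug, disease):
    """Return a binary vector over GO terms.

    An entry is 1 when a drug-GO-disease path passes through that term,
    i.e. the term is linked to both the drug and the disease.
    (Assumed construction; the paper may weight or combine paths differently.)
    """
    shared = drug_go.get(drug, set()) & disease_go.get(disease, set())
    return [1 if term in shared else 0 for term in go_terms]


def path_count(drug, disease):
    """Count-based feature: number of drug-GO-disease paths."""
    return sum(go_profile(drug, disease))


profile_a = go_profile("drugA", "diseaseX")
profile_b = go_profile("drugB", "diseaseX")
print(profile_a, path_count("drugA", "diseaseX"))
print(profile_b, path_count("drugB", "diseaseX"))
```

In this toy case both pairs have the same path count, yet their profiles differ because the paths run through different GO terms. That is the kind of information a count alone discards. Profiles of this form could then be passed as feature vectors to bagging or boosting classifiers.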
Identification of drug–target interactions (DTIs) is a crucial step in reducing the time and cost of the drug discovery and development process. Since various biological data are publicly available, DTIs have been identified computationally. To predict DTIs, most existing methods focus on a single similarity measure of drugs and target proteins, whereas some recent methods integrate a particular set of drug and target similarity measures using a single integration function. As a result, many DTIs are still missing. In this study, we propose heterogeneous network propagation with the forward similarity integration (FSI) algorithm, which systematically selects the optimal integration of multiple similarity measures of drugs and target proteins. Seven drug–drug and nine target–target similarity measures are combined using four distinct integration methods to create an optimal heterogeneous network model. The resulting optimal model uses the target similarity based on protein sequences and a fused drug similarity, which combines the similarity measures based on chemical structures, the Jaccard scores of drug–disease associations, and the cosine scores of drug–drug interactions. With an accuracy of 99.8%, this model significantly outperforms others that use different similarity measures of drugs and target proteins. In addition, validation of this model's DTI predictions demonstrates the ability of our method to discover missing potential DTIs.
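Two of the drug similarity measures named above, and one way of fusing them, can be sketched as follows. The abstract does not specify the four integration methods, so the simple mean used here is an assumption, as are the function names and the toy data. The sketch does not reproduce the FSI selection procedure.

```python
import math


def jaccard(a, b):
    """Jaccard score of two sets, e.g. the diseases associated with two drugs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine score of two equal-length vectors, e.g. drug-drug interaction profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def fuse_mean(scores):
    """Fuse several similarity scores by averaging (assumed integration function)."""
    return sum(scores) / len(scores)


# Hypothetical toy data for two drugs.
diseases_d1 = {"dis1", "dis2", "dis3"}
diseases_d2 = {"dis2", "dis3", "dis4"}
ddi_d1 = [1, 0, 1, 1]  # interaction indicators against four other drugs
ddi_d2 = [1, 1, 0, 1]

jaccard_score = jaccard(diseases_d1, diseases_d2)
cosine_score = cosine(ddi_d1, ddi_d2)
fused_score = fuse_mean([jaccard_score, cosine_score])
print(jaccard_score, cosine_score, fused_score)
```

A forward-selection scheme would start from a single measure and add further measures to the fused score only while a chosen evaluation metric keeps improving. The choice of metric and stopping rule is left open here.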