Prioritizing candidate disease-related genes using computational methods and biological network data is an important problem in bioinformatics. The random walk with restart (RWR) algorithm is widely used for this problem because of its relatively high accuracy. However, RWR is computationally expensive because it considers every node in the network. Here we propose a new method for prioritizing candidate genes in which genes with a low probability of association with disease genes are excluded from further consideration, thus reducing computational complexity. Experiments on real protein interaction networks show that the proposed method is computationally efficient and more accurate than RWR, as measured by AUC scores. We applied the proposed method to prioritize candidate genes for human type 2 diabetes. The results were promising: among the top 20 ranked genes, 11 are associated with diabetes, as reported in the biomedical literature.
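For readers unfamiliar with the baseline, the following is a minimal sketch of the standard RWR iteration, p ← (1 − r)·W·p + r·p0, on a small adjacency matrix. It is not the pruned method proposed in the abstract, whose details are not given here. The function name, the restart probability of 0.3, the tolerance, and the toy graph are all illustrative assumptions.

```python
import numpy as np


def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state RWR scores for every node (hypothetical helper).

    adjacency: square, symmetric 0/1 (or weighted) matrix of the network.
    seeds:     indices of known disease genes; the walker restarts at these.
    restart:   restart probability r (0.3 is an assumed value, not from the paper).
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = A / col_sums                       # column-normalized transition matrix

    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform restart distribution over seeds

    p = p0.copy()
    for _ in range(max_iter):
        # Every node is updated at every step, which is the cost the abstract refers to.
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 convergence check
            return p_next
        p = p_next
    return p


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed (assumed example data).
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_scores = random_walk_with_restart(toy_adjacency, seeds=[0])
toy_ranking = np.argsort(-toy_scores)      # candidate nodes ordered by decreasing score
```

Each iteration multiplies the full transition matrix by the score vector, so the cost grows with the size of the whole network regardless of how far a node is from the seeds. That is the overhead the abstract aims to reduce by excluding unlikely candidates.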
Disease gene prioritization is the process of ranking candidate genes according to their relevance to a disease phenotype, thus facilitating the identification of disease genes by narrowing down the set of genes to be tested experimentally. Many methods have been proposed for disease gene prioritization that apply various graph-based algorithms to the relationships between proteins encoded in protein-protein interaction networks. In this paper, we propose a novel method for prioritizing candidate disease genes by combining reinforcement learning with the PageRank algorithm and assigning priors to known disease genes. We experimentally evaluated the proposed method on a human protein interaction network and compared its performance with state-of-the-art methods, namely PageRank with priors, Random Walk with Restart, and K-Step Markov. The experimental results show that our method achieves relatively high performance in terms of AUC values and outperforms the compared methods.
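Both abstracts report performance as AUC. As a rough illustration, AUC for a ranking can be computed as the fraction of (disease gene, non-disease gene) pairs in which the disease gene receives the higher score, with ties counted as half. The sketch below assumes this pairwise definition. The function name and the toy scores are hypothetical, and the abstracts do not specify the evaluation protocol actually used (for example, which cross-validation scheme).

```python
def auc_from_scores(scores, labels):
    """Pairwise (Mann-Whitney) AUC for a list of scores and 0/1 labels.

    Hypothetical helper for illustration; labels mark known disease genes with 1.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count pairs where the positive outranks the negative; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Assumed toy data: four candidate genes, two of them known disease genes.
example_auc = auc_from_scores([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

An AUC of 1.0 means every disease gene is ranked above every non-disease gene, while 0.5 corresponds to a random ordering.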