Microbes with abnormal levels have important impacts on the formation and development of various complex diseases. Identifying possible Microbe-Disease Associations (MDAs) helps to understand the mechanisms of complex diseases. However, experimental methods for MDA identification are costly and time-consuming. In this study, a new computational model, RNMFMDA, was developed to find possible MDAs. RNMFMDA contains two main processes. First, Reliable Negative MDA samples were selected based on Positive-Unlabeled (PU) learning and random walk with restart on the heterogeneous microbe-disease network. Second, Logistic Matrix Factorization with Neighborhood Regularization (LMFNR) was developed to compute the association probabilities for all microbe-disease pairs. To evaluate the performance of the proposed RNMFMDA method, we compared RNMFMDA with five state-of-the-art MDA prediction methods based on five-fold cross-validations on microbes, diseases, and MDAs. As a result, RNMFMDA obtained the best AUCs of 0.6332, 0.8669, and 0.9081, respectively for the three five-fold cross validations, significantly outperforming other models. The promising prediction performance may be attributed to the following three features: highly quality negative MDA sample selection, LMFNR-based MDA prediction model, and various biological information integration. In addition, a few predicted microbe-disease pairs with high association scores are worthy of further experimental validation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.