A comprehensive understanding of the human protein-protein interaction (PPI) network, also known as the human interactome, can provide important insights into the molecular mechanisms of complex biological processes and diseases. Despite the remarkable experimental efforts undertaken to date to determine the structure of the human interactome, many PPIs remain unmapped. Computational approaches, especially network-based methods, can facilitate the identification of previously uncharacterized PPIs. Many such methods have been proposed, yet a systematic evaluation of existing network-based methods in predicting PPIs is still lacking. Here, we report community efforts initiated by the International Network Medicine Consortium to benchmark the ability of 26 representative network-based methods to predict PPIs across six different interactomes of four different organisms: A. thaliana, C. elegans, S. cerevisiae, and H. sapiens. Through extensive computational and experimental validations, we found that advanced similarity-based methods, which leverage the underlying network characteristics of PPIs, show superior performance over other general link prediction methods in the interactomes we considered.
A comprehensive understanding of the human protein-protein interaction (PPI) network, known as the human interactome, can provide important insights into the molecular mechanisms of complex biological processes and diseases. Despite the remarkable experimental efforts undertaken to date to determine the structure of the human interactome, many PPIs remain unmapped. Computational approaches, especially network-based methods, can facilitate the identification of new PPIs. Many such approaches have been proposed. However, a systematic evaluation of existing network-based methods in predicting PPIs is still lacking. Here, we report community efforts initiated by the International Network Medicine Consortium to benchmark the ability of 24 representative network-based methods to predict PPIs across five different interactomes, including a synthetic interactome generated by the duplication-mutation-complementation model, and the interactomes of four different organisms: A. thaliana, C. elegans, S. cerevisiae, and H. sapiens. We selected the top seven methods through a computational validation on the human interactome. We next experimentally validated their top 500 predicted PPIs (in total 3,276 predicted PPIs) using the yeast two-hybrid (Y2H) assay, finding 1,177 new human PPIs (involving 633 proteins). Our results indicate that task-tailored similarity-based methods, which leverage the underlying network characteristics of PPIs, show superior performance over other general link prediction methods. Through experimental validation, we confirmed that the top-ranking methods show promising performance externally. For example, from the top 500 PPIs predicted by an advanced similarity-based method [MPS(B&T)], 430 were successfully tested by Y2H, with 376 testing positive, yielding a precision of 87.4%. These results establish advanced similarity-based methods as powerful tools for the prediction of human PPIs.
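The reported precision for MPS(B&T) can be checked with a short calculation. The sketch below assumes, as the abstract's figures imply, that the denominator is the 430 pairs successfully tested by Y2H rather than all 500 predictions; the function name `precision` is illustrative and not taken from the paper.

```python
def precision(positives: int, tested: int) -> float:
    """Fraction of successfully tested pairs that tested positive."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    if not 0 <= positives <= tested:
        raise ValueError("positives must lie between 0 and tested")
    return positives / tested


# Figures quoted in the abstract for the top 500 MPS(B&T) predictions:
# 430 pairs were successfully tested and 376 of those were positive.
mps_bt_precision = precision(positives=376, tested=430)
print(f"{mps_bt_precision:.1%}")  # prints 87.4%
```

The 70 predictions that could not be tested are excluded from the denominator, so this figure describes the tested subset only.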
Background: Investigating possible interactions between two proteins in intracellular signaling is an expensive and laborious wet-lab procedure; therefore, several in silico approaches have been implemented to narrow down the candidates for future experimental validation. Reformulated in terms of network theory, the set of proteins can be represented as the nodes of a network and the interactions between them as the edges. The resulting protein–protein interaction (PPI) network enables the use of link prediction techniques to discover new probable connections. Here, we aimed to offer a novel approach to the link prediction task in PPI networks, utilizing a generative machine learning model. Results: We created a tool that consists of two modules, the data processing framework and the machine learning model. For data processing, we used a modified breadth-first search algorithm to traverse the network and extract induced subgraphs, which served as image-like input data for our model. For machine learning, we used a conditional generative adversarial network (cGAN) model inspired by image-to-image translation, with a Wasserstein distance-based loss improved with a gradient penalty. The model takes the combined representation from the data processing as input, and the generator is trained to predict the probable unknown edges in the provided induced subgraphs. Our link prediction tool was evaluated on the protein–protein interaction networks of five different species from the STRING database by calculating the area under the receiver operating characteristic curve, the area under the precision-recall curve, and the normalized discounted cumulative gain (AUROC, AUPRC, and NDCG, respectively). Test runs yielded averaged results of AUROC = 0.915, AUPRC = 0.176, and NDCG = 0.763 across all investigated species. Conclusion: We developed software for link prediction in PPI networks utilizing machine learning. The evaluation of our software serves as the first demonstration that a cGAN model, conditioned on raw topological features of the PPI network, is an applicable solution for the PPI prediction problem without requiring often unavailable molecular node attributes. The corresponding scripts are available at https://github.com/semmelweis-pharmacology/ppi_pred.
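To illustrate the data processing idea, the following is a minimal sketch of how a breadth-first traversal can yield an induced subgraph in an image-like (adjacency matrix) form. It uses a plain breadth-first search, not the authors' modified algorithm, and the function name, the `max_nodes` cutoff, and the toy protein labels are all assumptions made for this example.

```python
from collections import deque


def bfs_induced_subgraph(adj, start, max_nodes):
    """Collect up to max_nodes nodes in BFS order from start and return
    (nodes, matrix), where matrix is the 0/1 adjacency matrix of the
    subgraph induced by those nodes."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue and len(order) < max_nodes:
        node = queue.popleft()
        # Sorting neighbours makes the traversal order deterministic.
        for neighbour in sorted(adj[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
                if len(order) == max_nodes:
                    break
    index = {node: i for i, node in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    # Induced subgraph: keep every edge whose endpoints were both collected.
    for u in order:
        for v in adj[u]:
            if v in index:
                matrix[index[u]][index[v]] = 1
    return order, matrix


# Hypothetical toy network; the labels are placeholders, not real proteins.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}
nodes, matrix = bfs_induced_subgraph(toy_network, "A", max_nodes=4)
for row in matrix:
    print(row)
```

A fixed `max_nodes` gives every subgraph the same matrix size, which is what makes the output usable as image-like input to a convolutional model.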
Introduction: Signal detection yields confirmed signals in only 2.1%, which imposes a heavy burden on the pharmacovigilance system in the European Union. Objectives: We aimed to develop a network theoretical metric to increase the confirmed signal ratio of individual case safety report (ICSR) networks. Methods: ICSRs of five cardiovascular adverse events were requested from EudraVigilance. We developed Vigilace™, a web-based application to build network representations of ICSRs. Three network-based signal scores, which we termed NEWS (normalized edge weight for signals) scores, were calculated by normalizing the weight of each edge in the report-based weighted network by the weight of the same edge in topological weighted networks. Depending on the third node in topological network edges, we defined full-, adverse event-, and drug-type NEWS scores. Areas under the receiver operating characteristic curve (AUROC) were analyzed to compare the reporting odds ratio (ROR) and NEWS scores. Results: Overall, 72,475 ICSRs were accessed from EudraVigilance. The drug-type NEWS (NEWS_D) score performed better (DeLong test, p-value < 0.05) than the ROR for four adverse events: acute myocardial infarction (AUROC: 0.856 vs. 0.720), arrhythmia (0.657 vs. 0.614), pulmonary hypertension (0.861 vs. 0.720), and QT prolongation (0.830 vs. 0.749). Postural orthostatic tachycardia syndrome was excluded due to the lack of reference data. Conclusion: This is the first demonstration that report-based weighting normalized by topological weighting of co-reported drugs, which we termed the NEWS_D score, can perform better than the ROR. An application was developed for ICSR network analysis that facilitates the calculation of this score.
Bence Ágg and Péter Ferdinandy contributed equally as last authors.
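The two kinds of score compared above can be sketched as follows. The ROR is computed from the standard 2x2 contingency table of reports. The NEWS-style function only illustrates the general idea of dividing a report-based edge weight by the topological weight of the same edge; how the authors define either weight is not stated in the abstract, so the weights here are hypothetical inputs, and the function names and drug and event labels are placeholders.

```python
def reporting_odds_ratio(a, b, c, d):
    """ROR from a 2x2 table of report counts:
    a = drug of interest with the event, b = drug of interest without it,
    c = other drugs with the event,      d = other drugs without it."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive")
    return (a * d) / (b * c)


def normalized_edge_weights(report_weights, topological_weights):
    """Divide each report-based edge weight by the topological weight of the
    same edge (assumed form of the normalization). Edges with a missing or
    zero topological weight are skipped."""
    return {
        edge: weight / topological_weights[edge]
        for edge, weight in report_weights.items()
        if topological_weights.get(edge, 0) > 0
    }


# Made-up counts and weights, for illustration only.
ror = reporting_odds_ratio(a=20, b=80, c=10, d=890)
scores = normalized_edge_weights(
    {("drugX", "eventY"): 12, ("drugZ", "eventY"): 3},
    {("drugX", "eventY"): 4, ("drugZ", "eventY"): 6},
)
print(ror)
print(scores)
```

Under this assumed form, a score above 1 means the edge carries more report-based weight than its topological weight alone would suggest.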