Background: Recent studies have found that women with obstetric disorders are at increased risk for a variety of long-term complications. However, the underlying pathophysiology of these connections remains undetermined. A network-based view incorporating knowledge of other diseases and genetic associations will aid our understanding of the role of genetics in pregnancy-related disease complications. Methods: We built a disease–disease network (DDN) using UK Biobank (UKBB) summary data from a phenome-wide association study (PheWAS) to elaborate multiple disease associations. We also constructed egocentric DDNs, where each network focuses on a pregnancy-related disorder and its neighboring diseases. We then applied graph-based semi-supervised learning (GSSL) to translate the connections in the egocentric DDNs to pathologic knowledge. Results: A total of 26 egocentric DDNs were constructed, one for each pregnancy-related phenotype in the UKBB. Applying GSSL to each DDN, we obtained complication risk scores for additional phenotypes given the pregnancy-related disease of interest. Predictions were validated using co-occurrences derived from UKBB electronic health records. Compared with the full DDN, our proposed method raised the average area under the receiver operating characteristic curve (AUC) from 55.0% to 74.4%, a 1.35-fold increase. Conclusion: Egocentric DDNs hold promise as a clinical tool for the network-based identification of potential disease complications for a variety of phenotypes.
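The two steps described in the Methods above, extracting an egocentric network and scoring its nodes with GSSL, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy weight matrix, the one-hop radius, the function names `ego_subgraph` and `gssl_scores`, and the closed-form solution f = (I + μL)⁻¹y with the unnormalized graph Laplacian L are all assumptions chosen because they are a common GSSL formulation.

```python
import numpy as np

def ego_subgraph(W, ego, radius=1):
    """Return the node indices within `radius` hops of `ego` and the induced weights."""
    reached = {ego}
    frontier = {ego}
    for _ in range(radius):
        nxt = set()
        for u in frontier:
            nxt.update(np.flatnonzero(W[u] > 0).tolist())
        frontier = nxt - reached
        reached |= frontier
    nodes = sorted(reached)
    return nodes, W[np.ix_(nodes, nodes)]

def gssl_scores(W, y, mu=1.0):
    """Solve (I + mu * L) f = y, with L = D - W the unnormalized graph Laplacian."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(y)) + mu * L, y)

# Hypothetical 5-disease network; index 0 plays the role of the pregnancy-related disorder.
W = np.zeros((5, 5))
for i, j, w in [(0, 1, 1.0), (0, 2, 0.5), (2, 3, 0.8), (3, 4, 0.3)]:
    W[i, j] = W[j, i] = w

nodes, W_ego = ego_subgraph(W, ego=0, radius=1)

# Label only the ego disease; the other nodes start unlabeled (0).
y = np.zeros(len(nodes))
y[nodes.index(0)] = 1.0

scores = gssl_scores(W_ego, y, mu=1.0)
for node, score in zip(nodes, scores):
    print(f"disease {node}: score {score:.3f}")
```

In this toy case the neighbor with the stronger edge to the ego disease receives the higher score, which is the behavior one would want from a complication risk ranking.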
Background: Drug repurposing is motivated by the low probability of success in drug discovery. Over the past decade, in silico approaches have received primary attention as a first step toward reducing the high cost and long duration of drug development. Such studies benefit from the abundance, variety, and easy accessibility of pharmaceutical and biomedical data. Taking advantage of this research-friendly environment, we propose a network-based machine learning algorithm for drug repurposing. In particular, we present a framework for constructing a drug network and for strengthening it with multiple heterogeneous types of data. Results: The proposed method consists of three steps. First, we construct a drug network from drug-target protein information. Second, the drug network is reinforced with drug-drug interaction knowledge on bioactivity and/or medication from literature databases. Through this enhancement, the connected nodes and the edges between them become more numerous and more informative, which can raise the probability of success of in silico drug repurposing. Third, the enhanced network recommends candidate drugs for repurposing through drug scoring. The scoring process uses graph-based semi-supervised learning to prioritize the recommendations. Conclusions: The reinforcement improves both the coverage and the connectivity of the drug network: the number of drugs covered increases from 4738 to 5442, and the number of drug-drug associations from 808,752 to 982,361. With the enhanced network, drug recommendation becomes more reliable: the AUC rose from 0.79 to 0.89. As a typical case, 11 drugs were recommended for vascular dementia, including amantadine, conotoxin GV, tenocyclidine, and cycloleucine. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2858-6) contains supplementary material, which is available to authorized users.
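The first two steps above, building a network from drug-target information and then reinforcing it with additional drug-drug pairs, can be sketched as follows. The abstract does not state how edge weights are computed, so the Jaccard overlap of target sets, the fixed weight for literature-derived pairs, the drug and protein names, and the helper names are all hypothetical choices made for illustration.

```python
from itertools import combinations

def target_similarity_edges(drug_targets):
    """Connect two drugs when they share a target; weight = Jaccard overlap (assumed)."""
    edges = {}
    for a, b in combinations(sorted(drug_targets), 2):
        shared = drug_targets[a] & drug_targets[b]
        if shared:
            union = drug_targets[a] | drug_targets[b]
            edges[(a, b)] = len(shared) / len(union)
    return edges

def reinforce(edges, extra_pairs, weight=1.0):
    """Add drug-drug pairs from another source, keeping the larger weight on overlap."""
    merged = dict(edges)
    for a, b in extra_pairs:
        key = tuple(sorted((a, b)))
        merged[key] = max(merged.get(key, 0.0), weight)
    return merged

def coverage(edges):
    """Number of drugs that appear in at least one edge."""
    return len({drug for pair in edges for drug in pair})

# Hypothetical drug-target annotations.
drug_targets = {
    "drugA": {"P1", "P2"},
    "drugB": {"P2"},
    "drugC": {"P3"},
    "drugD": {"P4"},
}
# Hypothetical drug-drug pair taken from a literature database.
literature_pairs = [("drugC", "drugA")]

base = target_similarity_edges(drug_targets)
enhanced = reinforce(base, literature_pairs)

print("drugs covered:", coverage(base), "->", coverage(enhanced))
print("edges:", len(base), "->", len(enhanced))
```

Here drugC has no shared target with any other drug, so it only enters the network through the literature-derived pair, mirroring how reinforcement can increase both coverage and the number of associations.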
Summary: Immune diseases have a strong genetic component with Mendelian patterns of inheritance. While this tight association has been central to understanding the underlying pathophysiology of immune diseases, the features these diseases have in common remain unclear. Based on the potential commonality among immune genes, we design Gene Ranker for key gene identification. Gene Ranker is a network-based gene scoring algorithm that first constructs a backbone network from protein interactions. Patient gene expression networks are then added to the backbone: an add-on process screens networks obtained by weighted gene co-expression network analysis (WGCNA) on samples from immune patients. Gene Ranker is disease-specific; however, any WGCNA network that passes the screening procedure can be added on. With the constructed network, it employs semi-supervised learning for gene scoring. Results: The proposed method was applied to immune diseases. Based on the resulting scores, Gene Ranker identified potential key genes in immune diseases. In scoring validation, an average area under the receiver operating characteristic curve of 0.82 was achieved, which is a significant increase from the reference average of 0.76. Highly ranked genes were verified through retrieval and review of 27 million PubMed articles. As a typical case, 20 potential key genes in rheumatoid arthritis were identified: 10 were de facto genes and the remaining 10 were novel. Availability and Implementation: Gene Ranker is available at http://www.alphaminers.net/GeneRanker/ Supplementary information: Supplementary data are available at Bioinformatics online.
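The scoring validation above reports an area under the ROC curve. As a reminder of what that number measures, the sketch below computes it as the probability that a randomly chosen known gene is scored above a randomly chosen unknown gene, with ties counted as one half. The labels and scores are invented for illustration and are not taken from the study; `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: 1 marks a gene already known to be disease-related.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of genes, which is fine for a small check; a rank-based computation would be preferable at genome scale.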
Background: In cancer prognosis research, diverse machine learning models have been applied to the problems of cancer susceptibility (risk assessment), cancer recurrence (redevelopment of cancer after resolution), and cancer survivability, with accuracy (or AUC, the area under the ROC curve) as the primary measure for evaluating model performance. However, to help medical specialists establish a treatment plan from the predicted output of a model, it is more pragmatic to elucidate which variables (markers) most significantly influence the cancer outcome and which patients show similar patterns. Methods: In this study, an approach coupling two sub-modules, a predictor and a descriptor, is proposed. The predictor module generates the predicted output for the cancer outcome. A semi-supervised learning co-training algorithm is employed as the predictor. The descriptor module post-processes the results of the predictor module, focusing mainly on which variables rank as more or less significant in describing the prediction results, and on how patients are segmented into groups according to the patterns they share. Decision trees are used as the descriptor. Results: The proposed approach, 'predictor-descriptor,' was tested on the breast cancer survivability problem using the Surveillance, Epidemiology, and End Results (SEER) database for breast cancer. The results present a performance comparison among established machine learning algorithms, the ranks of the prognosis elements for breast cancer, and patient segmentation. In the performance comparison among the predictor candidates, the semi-supervised learning co-training algorithm showed the best performance, producing an average AUC of 0.81. The descriptor module then found the top-tier prognosis markers that significantly affect the classification of patients as survived or dead: 'lymph node involvement', 'stage', 'site-specific surgery', 'number of positive nodes examined', and 'tumor size', among others. A typical example of patient segmentation was also provided: the patients classified as dead were grouped into two segments according to differences in their prognostic profiles, one with serious results on the pathologic exams and the other with frailty due to age.