With the advances in high-throughput technologies, millions of somatic mutations have been reported in the past decade. Identifying driver genes with oncogenic mutations from these data is a critical and challenging problem. Many computational methods have been proposed to predict driver genes. Among them, machine learning-based methods usually train a classifier with representations that concatenate various types of features extracted from different kinds of data. Although this approach has been successful, simple concatenation may not be the best way to fuse these data. Several types of data characterize the similarities between genes. To better integrate them with other data and improve the accuracy of driver gene prediction, this study proposes a deep learning-based method (deepDriver) that performs convolution on the mutation-based features of genes and their neighbors in the similarity networks. The method allows the convolutional neural network to learn information within mutation data and similarity networks simultaneously, which enhances the prediction of driver genes. deepDriver achieves AUC scores of 0.984 and 0.976 on breast cancer and colorectal cancer, respectively, which are superior to those of the competing algorithms. Further evaluations of the top 10 predictions also demonstrate that deepDriver is valuable for predicting new driver genes.
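The idea of convolving over a gene and its neighbors can be illustrated by how the input to such a network might be assembled. The following is a minimal sketch, not the deepDriver implementation: the function name `neighbor_feature_matrix`, the toy feature and similarity values, and the choice of the top-k most similar genes as "neighbors" are all assumptions made for illustration.

```python
import numpy as np

def neighbor_feature_matrix(features, similarity, gene, k):
    """Stack a gene's feature vector on top of those of its k most similar genes.

    Hypothetical helper: the abstract does not specify how neighbors are chosen.
    """
    sims = similarity[gene].astype(float).copy()
    sims[gene] = -np.inf  # never pick the gene as its own neighbor
    neighbors = np.argsort(-sims, kind="stable")[:k]  # most similar first
    rows = np.concatenate(([gene], neighbors))
    return features[rows]  # shape: (k + 1, n_features)

# Toy data (assumed): 4 genes, 2 mutation-based features each.
features = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5]])
similarity = np.array([
    [1.0, 0.2, 0.9, 0.4],
    [0.2, 1.0, 0.1, 0.8],
    [0.9, 0.1, 1.0, 0.3],
    [0.4, 0.8, 0.3, 1.0],
])

# Gene 0 followed by its two most similar genes (2, then 3).
matrix = neighbor_feature_matrix(features, similarity, gene=0, k=2)
```

A convolutional layer applied to `matrix` would then see the gene's own mutation features alongside those of its network neighbors in a single input.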
Motivation: Computationally predicting disease genes helps scientists optimize in-depth experimental validation and accelerates the identification of real disease-associated genes. Modern high-throughput technologies have generated a vast amount of omics data, and integrating them is expected to improve the accuracy of computational prediction. As an integrative model, the multimodal deep belief net (DBN) can capture cross-modality features from heterogeneous datasets to model a complex system. Studies have shown its power in image classification and tumor subtype prediction. However, multimodal DBN has not been used in predicting disease–gene associations. Results: In this study, we propose a method to predict disease–gene associations by multimodal DBN (dgMDL). Specifically, latent representations of protein-protein interaction networks and gene ontology terms are first learned by two DBNs independently. Then, a joint DBN is used to learn cross-modality representations from the two sub-models by taking the concatenation of their latent representations as the multimodal input. Finally, disease–gene associations are predicted with the learned cross-modality representations. The proposed method is compared with two state-of-the-art algorithms using 5-fold cross-validation on a set of curated disease–gene associations. dgMDL achieves an AUC of 0.969, which is superior to those of the competing algorithms. Further analysis of the top-10 unknown disease–gene pairs also demonstrates the ability of dgMDL to predict new disease–gene associations. Availability and implementation: Prediction results and a reference implementation of dgMDL in Python are available at https://github.com/luoping1004/dgMDL. Supplementary information: Supplementary data are available at Bioinformatics online.
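The data flow of the fusion step (two modality-specific encoders, concatenation, then a joint layer) can be sketched as follows. This is only an illustration of the shapes involved, not the dgMDL code: each DBN is replaced by a single sigmoid layer with random, untrained weights, and all dimensions and variable names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(x, weights, bias):
    """One sigmoid layer, standing in for a trained DBN's upward pass."""
    return sigmoid(x @ weights + bias)

rng = np.random.default_rng(0)
# Assumed toy sizes, chosen only to make the shapes easy to follow.
n_genes, ppi_dim, go_dim, latent_dim, joint_dim = 5, 8, 6, 3, 4

# Placeholder inputs: PPI-derived features and binary GO-term annotations.
ppi_input = rng.random((n_genes, ppi_dim))
go_input = rng.integers(0, 2, size=(n_genes, go_dim)).astype(float)

# Step 1: each modality is encoded independently.
ppi_latent = encode(ppi_input, rng.normal(size=(ppi_dim, latent_dim)), np.zeros(latent_dim))
go_latent = encode(go_input, rng.normal(size=(go_dim, latent_dim)), np.zeros(latent_dim))

# Step 2: the latent representations are concatenated into the multimodal input.
multimodal_input = np.concatenate([ppi_latent, go_latent], axis=1)

# Step 3: a joint layer maps the concatenation to a cross-modality representation.
joint = encode(
    multimodal_input,
    rng.normal(size=(2 * latent_dim, joint_dim)),
    np.zeros(joint_dim),
)
```

In a real model the weights would come from DBN training rather than random initialization, and a classifier would be fitted on `joint` to score disease–gene pairs.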
HUMANOID is a user interface design tool that lets designers express abstract conceptualizations of an interface in an executable form, allowing designers to experiment with scenarios and dialogues even before the application model is completely worked out. Three properties of the HUMANOID approach allow it to do so: a modularization of design issues into independent dimensions, support for multiple levels of specificity in mapping application models to user interface constructs, and mechanisms for constructing executable default user interface implementations from whatever level of specificity has been provided by the designer.
Disease gene prediction is a challenging task that has a variety of applications such as early diagnosis and drug development. Existing machine learning methods suffer from the imbalanced sample issue because the number of known disease genes (positive samples) is much smaller than that of unknown genes, which are typically considered to be negative samples. In addition, most methods have not utilized clinical data from patients with a specific disease to predict disease genes. In this study, we propose a disease gene prediction algorithm (called dgSeq) that combines protein-protein interaction (PPI) networks, clinical RNA-Seq data, and Online Mendelian Inheritance in Man (OMIM) data. dgSeq constructs differential networks based on rewiring information calculated from clinical RNA-Seq data. To select balanced sets of non-disease genes (negative samples), a disease-gene network is also constructed from OMIM data. After features are extracted from the PPI networks and differential networks, logistic regression classifiers are trained. dgSeq obtains AUC values of 0.88, 0.83 and 0.80 for identifying breast cancer genes, thyroid cancer genes and Alzheimer's disease genes, respectively, which indicates its superiority to three other competing methods. Both gene set enrichment analysis and predicted results demonstrate that dgSeq can effectively predict new disease genes.
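One common way to quantify rewiring between conditions is the change in gene-gene correlation. The sketch below builds a differential network on that basis. The abstract does not state how dgSeq computes rewiring, so the correlation-difference measure, the threshold, the function name `differential_network`, and the toy expression values are all assumptions.

```python
import numpy as np

def differential_network(control_expr, disease_expr, threshold):
    """Boolean adjacency of gene pairs whose correlation changes by more than `threshold`.

    Rows are genes, columns are samples. Hypothetical rewiring measure:
    absolute difference of Pearson correlations between the two conditions.
    """
    control_corr = np.corrcoef(control_expr)
    disease_corr = np.corrcoef(disease_expr)
    rewiring = np.abs(disease_corr - control_corr)
    np.fill_diagonal(rewiring, 0.0)  # ignore self-pairs
    return rewiring > threshold

# Toy expression data (assumed): 3 genes x 4 samples per condition.
control = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # positively correlated with gene 0
    [1.0, 3.0, 2.0, 4.0],
])
disease = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [8.0, 6.0, 4.0, 2.0],   # now negatively correlated with gene 0
    [1.0, 3.0, 2.0, 4.0],
])

adjacency = differential_network(control, disease, threshold=1.0)
```

Here gene 1 reverses its relationship with genes 0 and 2 in the disease condition, so only the edges touching gene 1 appear in `adjacency`. Features derived from such a network could then be fed, together with PPI features, to a logistic regression classifier.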
Background: Although structural and functional changes of the striatum and hippocampus are present in familial Alzheimer's disease, little is known about the effects of specific gene mutations or disease progression on their related neural circuits. This study aimed to evaluate the effects of known pathogenic gene mutations and disease progression on the striatum- and hippocampus-related neural circuits, including the frontostriatal and hippocampus-posterior cingulate cortex (PCC) pathways. Methods: A total of 102 healthy mutation non-carriers, 40 presymptomatic mutation carriers (PMC), and 30 symptomatic mutation carriers (SMC) of the amyloid precursor protein (APP), presenilin 1 (PS1), or presenilin 2 gene, with T1 structural MRI, diffusion tensor imaging, and resting-state functional MRI, were included. Representative neural circuits and their key nodes were obtained, including bilateral caudate-rostral middle frontal gyrus (rMFG), putamen-rMFG, and hippocampus-PCC. Volumes, diffusion indices, and functional connectivity of circuits were compared between groups and correlated with neuropsychological and clinical measures. Results: In PMC, APP gene mutation carriers showed impaired diffusion indices of the caudate-rMFG and putamen-rMFG circuits; PS1 gene mutation carriers showed increased fiber numbers of the putamen-rMFG circuit. SMC showed increased diffusivity of the left hippocampus-PCC circuit and volume reduction of all regions as compared with PMC. Imaging measures, especially axial diffusivity, of the representative circuits were correlated with neuropsychological measures. Conclusions: APP and PS1 gene mutations affect frontostriatal circuits in different ways in familial Alzheimer's disease; disease progression primarily affects the structure of the hippocampus-PCC circuit. The structural connectivity of both frontostriatal and hippocampus-PCC circuits is associated with general cognitive function.
Such findings may provide further information about the imaging biomarkers for early identification and prognosis of familial Alzheimer's disease, and pave the way for early diagnosis, gene- or circuit-targeted treatment, and even prevention.