Background: The genome-wide identification of both morbid genes, i.e., genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach that could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the discovery of causal relationships between genes and diseases, as well as the determination of the druggability of gene products.
Results: In this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%. It was then used to assign morbidity and druggability scores to genes not known to be morbid or druggable, and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability. Among these rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors for morbidity and druggability, respectively.
Conclusions: We were able to demonstrate that network topological features along with tissue expression profile and subcellular localization can reliably predict human morbid and druggable genes on a genome-wide scale. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.
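The two figures reported for the meta-classifier, the fraction of known genes recovered (recall) and the precision, can be related to raw prediction counts as in the minimal sketch below. The function name `precision_recall` and the counts used are illustrative assumptions only; they are not taken from the paper's datasets.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    precision = TP / (TP + FP): share of predicted genes that are truly positive.
    recall    = TP / (TP + FN): share of known positive genes that were recovered.
    """
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic:
# 100 known positive genes, 65 of them recovered, plus 33 false positives.
precision, recall = precision_recall(true_pos=65, false_pos=33, false_neg=35)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A classifier can trade one quantity for the other by moving its score threshold, which is why the abstract reports both.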
The transcription process is crucial to life, and the enzyme RNA polymerase (RNAP) is the major component of the transcription machinery. The development of single-molecule techniques, such as magnetic and optical tweezers, atomic-force microscopy and single-molecule fluorescence, has increased our understanding of the transcription process and complements traditional biochemical studies. Based on these studies, theoretical models have been proposed to explain and predict the kinetics of RNAP during polymerization, most notably models based on the thermodynamic stability of the transcription elongation complex. However, experiments have shown that if more than one RNAP initiates from the same promoter, the transcription behavior changes slightly and new phenomena are observed. We proposed and implemented a theoretical model that considers collisions between RNAPs and predicts their cooperative behavior during multi-round transcription, generalizing the stochastic sequence-dependent model of Bai et al. In our approach, collisions between elongating enzymes modify their transcription rates. We performed the simulations in Mathematica® and compared the results for single-molecule and multiple-molecule transcription with experimental results and other theoretical models. Our multi-round approach recovers several expected behaviors and shows that transcription of the studied sequences can be accelerated by up to 48% when collisions are allowed: both the dwell times at pause sites and the distances that the RNAPs backtrack at backtracking sites are reduced.
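The idea that a trailing RNAP can change the rate of the enzyme ahead of it can be illustrated with a toy lattice simulation. The sketch below is not the Bai et al. model or the authors' Mathematica implementation: it ignores sequence-dependent thermodynamics and backtracking, treats each RNAP as occupying a single site, and every name and parameter (`simulate`, `p_step`, `p_pause_escape`, `push_factor`, the pause positions) is a hypothetical choice made for illustration.

```python
import random


def simulate(n_rnap, length=200, pause_sites=(50, 120), p_step=0.9,
             p_pause_escape=0.05, push_factor=4.0, seed=0):
    """Toy multi-RNAP elongation on a 1D template.

    Returns (steps, positions), where `steps` is the number of time steps
    the leading RNAP needs to reach `length`.
    """
    rng = random.Random(seed)
    pauses = set(pause_sites)
    # positions[0] is the leader; trailing enzymes start 10 sites apart upstream.
    positions = [-(i * 10) for i in range(n_rnap)]
    steps = 0
    while positions[0] < length:
        steps += 1
        for i in range(len(positions)):
            pos = positions[i]
            # Forward-step probability is low at a pause site.
            p = p_pause_escape if pos in pauses else p_step
            # Collision rule (assumed): an enzyme directly behind raises the rate.
            if i + 1 < len(positions) and positions[i + 1] == pos - 1:
                p = min(1.0, p * push_factor)
            # Steric exclusion: cannot step onto the site of the enzyme ahead.
            if i > 0 and positions[i - 1] == pos + 1:
                continue
            if rng.random() < p:
                positions[i] = pos + 1
    return steps, positions


single_steps, _ = simulate(n_rnap=1)
multi_steps, _ = simulate(n_rnap=3)
print("single RNAP:", single_steps, "steps; three RNAPs:", multi_steps, "steps")
```

With `n_rnap=1` the collision rule never fires, so the run reduces to the single-molecule case. Comparing step counts averaged over many seeds, rather than a single run, would be the fair way to estimate any speed-up in this toy setting.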