Background: One of the greatest challenges in cancer genomics is to distinguish driver mutations from passenger mutations. Whereas recurrence is a hallmark of driver mutations, recurring noncoding mutations are difficult to observe because few whole-genome sequenced samples are available. Hence, a method is needed to predict potentially recurrent mutations.
Results: In this work, we developed a random forest classifier that predicts regulatory mutations that may recur, based on the features of the mutations that appear repeatedly in a given cohort. With breast cancer as a model, we profiled 35 quantitative features describing genetic and epigenetic signals at the mutation site, transcription factors whose binding motif was disrupted by the mutation, and genes targeted by long-range chromatin interactions. A true set of mutations for machine learning was generated by interrogating publicly available pan-cancer genomes based on our statistical model of mutation recurrence. The performance of our random forest classifier was evaluated by cross-validation, and the variable importance of each feature in the classification of mutations was investigated. Using our statistical recurrence model, the random forest classifier showed an area under the curve (AUC) of ~0.78 in predicting recurrent mutations. Chromatin accessibility at the mutation sites, the distance from the mutations to known cancer risk loci, and the role of the target genes in the regulatory or protein interaction network were among the most important variables.
Conclusions: Our methods make it possible to characterize recurrent regulatory mutations using a limited number of whole-genome samples and, based on that characterization, to predict potential driver mutations whose recurrence is not found in the given samples but is likely to be observed with additional samples.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1385-y) contains supplementary material, which is available to authorized users.
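The abstract above reports classifier performance as a cross-validated area under the curve (AUC). As a minimal sketch of what that metric measures, and not the authors' implementation, the block below computes AUC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, and builds simple test folds for cross-validation. The function names `auc` and `k_fold_indices`, the toy labels, and the toy scores are all hypothetical. In the study the scores would come from the random forest classifier; here they are made-up numbers.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous, near-equal test folds."""
    folds = []
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds get one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical toy data: 1 = recurrent mutation, 0 = non-recurrent mutation.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]

toy_auc = auc(labels, scores)      # 14 of 16 pairs ranked correctly
folds = k_fold_indices(10, 3)      # fold sizes 4, 3, 3
print(toy_auc, folds)
```

In a cross-validation run, each fold in turn would serve as the held-out test set, and the per-fold AUC values would be averaged. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported value of ~0.78.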
Global network modeling of distal regulatory interactions is essential in understanding the overall architecture of gene expression programs. Here, we developed a Bayesian probabilistic model and computational method for global causal network construction with breast cancer as a model. Whereas physical regulator binding was well supported by gene expression causality in general, distal elements in intragenic regions or loci distant from the target gene exhibited particularly strong functional effects. Modeling the action of long-range enhancers was critical in recovering true biological interactions with increased coverage and specificity overall and unraveling regulatory complexity underlying tumor subclasses and drug responses in particular. Transcriptional cancer drivers and risk genes were discovered based on the network analysis of somatic and genetic cancer-related DNA variants. Notably, we observed that the risk genes were functionally downstream of the cancer drivers and were selectively susceptible to network perturbation by tumorigenic changes in their upstream drivers. Furthermore, cancer risk alleles tended to increase the susceptibility of the transcription of their associated genes. These findings suggest that transcriptional cancer drivers selectively induce a combinatorial misregulation of downstream risk genes, and that genetic risk factors, mostly residing in distal regulatory regions, increase transcriptional susceptibility to upstream cancer-driving somatic changes.
Although many genetic loci in noncoding regions are associated with vascular disease, studies of long noncoding RNAs (lncRNAs) that were discovered in human plaques and affect atherosclerosis have been highly limited. We aimed to identify and functionally validate a lncRNA using human atherosclerotic plaques. Human aortic samples were obtained from patients who underwent aortic surgery, and tissues were classified according to atherosclerotic plaques. RNA was extracted and analyzed for differentially expressed lncRNAs in plaques. Human aortic smooth muscle cells (HASMCs) were stimulated with oxidized low-density lipoprotein (oxLDL) to evaluate the effect of the identified lncRNA on the inflammatory transition of the cells. Among 380 RNAs differentially expressed between the plaque and control tissues, lncRNA HSPA7 was selected and confirmed to show upregulated expression upon oxLDL treatment. HSPA7 knockdown inhibited the migration of HASMCs and the secretion and expression of IL-1β and IL-6; in addition, HSPA7 knockdown reversed the oxLDL-induced reduction in the expression of contractile markers. Although miR-223 inhibition promoted the activity of NF-κB and the secretion of inflammatory proteins such as IL-1β and IL-6, HSPA7 knockdown diminished these effects. The effects of miR-223 inhibition and HSPA7 knockdown were also found in THP-1 cell-derived macrophages. The impact of HSPA7 on miR-223 was mediated in an AGO2-dependent manner. HSPA7 is differentially increased in human atheroma and promotes the inflammatory transition of vascular smooth muscle cells by sponging miR-223. For the first time, this study elucidated the molecular mechanism of action of HSPA7, a lncRNA of previously unknown function, in humans.