The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but a similar reference has been lacking for epigenomic studies. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection to date of human epigenomes for primary cells and tissues. Here, we describe the integrative analysis of 111 reference human epigenomes generated as part of the program, profiled for histone modification patterns, DNA accessibility, DNA methylation, and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and identify their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation, and human disease.
The pan-cancer analysis of whole genomes
The expansion of whole-genome sequencing studies from individual ICGC and TCGA working groups presented the opportunity to undertake a meta-analysis of genomic features across tumour types. To achieve this, the PCAWG Consortium was established. A Technical Working Group implemented the informatics analyses by aggregating the raw sequencing data from different working groups that studied individual tumour types, aligning the sequences to the human genome and delivering a set of high-quality somatic mutation calls for downstream analysis (Extended Data Fig. 1). Given the recent meta-analysis
Histones are frequently decorated with covalent modifications. These histone modifications are thought to be involved in various chromatin-dependent processes including transcription. To elucidate the relationship between histone modifications and transcription, we derived quantitative models to predict the expression level of genes from histone modification levels. We found that histone modification levels and gene expression are very well correlated. Moreover, we show that only a small number of histone modifications are necessary to accurately predict gene expression. We show that different sets of histone modifications are necessary to predict gene expression driven by high CpG content promoters (HCPs) or low CpG content promoters (LCPs). Quantitative models involving H3K4me3 and H3K79me1 are the most predictive of the expression levels in LCPs, whereas HCPs require H3K27ac and H4K20me1. Finally, we show that the connections between histone modifications and gene expression seem to be general, as we were able to predict gene expression levels of one cell type using a model trained on another one.
high CpG content promoter | low CpG content promoter | regression analysis | transcription
The DNA of eukaryotic organisms is packaged into chromatin, whose basic repeating unit is the nucleosome. A nucleosome is formed by wrapping 147 base pairs of DNA around an octamer of four core histones, H2A, H2B, H3, and H4 (1-5) which are subject to a number of posttranslational covalent modifications [(6); for review, see ref. 7]. These modifications can alter the chromatin structure and function by changing the charge of the nucleosome particle, and/or by recruiting protein complexes either individually or in combination (8).
Hence, histone modifications are thought to constitute a "Histone Code," which is read out by proteins to bring about specific downstream effects (9, 10).
Histone modifications have been linked to a number of chromatin-dependent processes, including replication, DNA-repair, and transcription. The link between histone modifications and transcription has been particularly intensively studied. It has been found that individual modifications can be associated with transcriptional activation or repression. Acetylation and phosphorylation generally accompany transcription; sumoylation, deimination, and proline isomerization are usually found in transcriptionally silent regions; methylation and ubiquitination are implicated in both activation and repression of transcription (8). Furthermore, the establishment of some modifications is dependent on the presence of other modifications, e.g., the catalysis of H3K4me3 requires the presence of H2BK120ub1 (the so-called trans-tail pathway) and the phosphorylation on serine 5 on the C-terminal domain of RNA polymerase II (pol II) (for review, see ref. 11, which also reviews other examples for the combinatorial action of histone modifications).
Transcription proceeds in a series of steps, also referred to as transcription cycle, starting with preinitiation complex form...
Cancer is a disease potentiated by mutations in somatic cells. Cancer mutations are not distributed uniformly along the genome. Instead, different genomic regions vary by up to 5-fold in the local density of somatic mutations (1), posing a fundamental problem for statistical methods of cancer genomics. Epigenomic organization has been proposed as a major determinant of the cancer mutational landscape (1-5). However, both somatic mutagenesis and epigenomic features are highly cell-type-specific (6, 7). We investigated the distribution of mutations in multiple samples of diverse cancer types and compared them to cell-type-specific epigenomic features. Here, we show that chromatin accessibility and modification, together with replication timing, explain up to 86% of the variance in mutation rates along cancer genomes. Overwhelmingly, the best predictors of local somatic mutation density are epigenomic features derived from the most likely cell type of origin of the corresponding malignancy. Moreover, we find that cell-of-origin chromatin features are much stronger determinants of cancer mutation profiles than chromatin features of cognate cancer cell lines. We show further that the cell type of origin of a cancer can be accurately determined based on the distribution of mutations along its genome. Thus, the DNA sequence of a cancer genome encompasses a wealth of information about the identity and epigenomic features of its cell of origin.
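To make the phrase "explain up to 86% of the variance" concrete, the following is a minimal sketch of how a variance-explained figure (R²) can be computed when regional mutation density is regressed on per-region features. Everything here is an illustrative assumption: the data are synthetic, the three feature columns are hypothetical stand-ins for epigenomic tracks, and ordinary least squares is used only for simplicity. It is not a reproduction of the regression method used in the study.

```python
import numpy as np

def variance_explained(features, mutation_density):
    """Fit ordinary least squares with an intercept and return R^2,
    the fraction of variance in mutation_density explained by features."""
    # Prepend a column of ones so the model includes an intercept term.
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, mutation_density, rcond=None)
    residuals = mutation_density - X @ coef
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((mutation_density - mutation_density.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

# Synthetic example: 500 hypothetical genomic windows, 3 hypothetical features
# (stand-ins for, e.g., accessibility, a histone mark, replication timing).
rng = np.random.default_rng(0)
n_windows = 500
features = rng.normal(size=(n_windows, 3))
true_weights = np.array([-1.0, -0.5, 0.8])  # assumed, for illustration only
mutation_density = features @ true_weights + rng.normal(scale=0.5, size=n_windows)

r2 = variance_explained(features, mutation_density)
print(f"variance explained: {r2:.2f}")
```

Note that this R² is computed on the same data used for fitting; a held-out or cross-validated estimate would be needed to judge predictive accuracy.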