The human body is composed of diverse cell types with distinct functions. While it is known that lineage specification depends on cell-specific gene expression, which in turn is driven by promoters, enhancers, insulators and other cis-regulatory DNA sequences for each gene [1–3], the relative roles of these regulatory elements in this process are not clear. We have previously developed a chromatin immunoprecipitation-based microarray method (ChIP-chip) to locate promoters, enhancers, and insulators in the human genome [4–6]. Here, we use the same approach to identify these elements in multiple cell types and investigate their roles in cell type-specific gene expression. We observe that chromatin state at promoters and CTCF binding at insulators are largely invariant across diverse cell types. By contrast, enhancers are marked with highly cell type-specific histone modification patterns, correlate strongly with cell type-specific gene expression programs on a global scale, and are functionally active in a cell type-specific manner. Our results define over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalog of human enhancers and highlighting the role of these elements in cell type-specific gene expression.
Higher-order chromatin structure is emerging as an important regulator of gene expression. Although dynamic chromatin structures have been identified in the genome, the full scope of chromatin dynamics during mammalian development and lineage specification remains obscure. By mapping genome-wide chromatin interactions in human embryonic stem cells (hESC) and four hESC-derived lineages, we uncover extensive chromatin reorganization during lineage specification. We observe that while topological domain boundaries remain intact during differentiation, interactions both within and between domains change dramatically, altering 36% of active and inactive chromosomal "compartments" throughout the genome. By integrating chromatin interaction maps with haplotype-resolved epigenome and transcriptome datasets, we find widespread allelic bias in gene expression correlated with allele-biased chromatin states of linked promoters and distal enhancers. Our results therefore provide a global view of chromatin dynamics and a resource for studying long-range control of gene expression in distinct human cell lineages.
The laboratory mouse is the most widely used mammalian model organism in biomedical research. The 2.6 × 10^9 bases of the mouse genome possess a high degree of conservation with the human genome [1], so a thorough annotation of the mouse genome will be of significant value to understanding the function of the human genome. So far, most of the functional sequences in the mouse genome have yet to be found, and the cis-regulatory sequences in particular are still poorly annotated. Comparative genomics has been a powerful tool for the discovery of these sequences [2], but on its own it cannot resolve their temporal and spatial functions. Recently, ChIP-Seq has been developed to identify cis-regulatory elements in the genomes of several organisms including humans, Drosophila melanogaster and Caenorhabditis elegans [3–5]. Here we apply the same experimental approach to a diverse set of 19 tissues and cell types in the mouse to produce a map of nearly 300,000 murine cis-regulatory sequences. The annotated sequences add up to 11% of the mouse genome, and include more than 70% of conserved non-coding sequences. We define tissue-specific enhancers and identify potential transcription factors regulating gene expression in each tissue or cell type. Finally, we show that much of the mouse genome is organized into domains of coordinately regulated enhancers and promoters. Our results provide a resource for the annotation of functional elements in the mammalian genome and for the study of mechanisms regulating tissue-specific gene expression.
Insulator elements affect gene expression by preventing the spread of heterochromatin and restricting transcriptional enhancers from activating unrelated promoters. In vertebrates, insulator function requires association with the CCCTC-binding factor (CTCF), a protein that recognizes long and diverse nucleotide sequences. While insulators are critical in gene regulation, only a few have been reported. Here, we describe 13,804 CTCF-binding sites in potential insulators of the human genome, discovered experimentally in primary human fibroblasts. Most of these sequences are located far from the transcriptional start sites, with their distribution strongly correlated with genes. The majority of them fit a consensus motif that is highly conserved and suitable for predicting possible CTCF-driven insulators in other vertebrate genomes. In addition, CTCF localization is largely invariant across different cell types. Our results provide a resource for investigating insulator function and other possible general and evolutionarily conserved activities of CTCF sites.
We have isolated and analyzed human CTCF cDNA clones and show here that the ubiquitously expressed 11-zinc-finger factor CTCF is an exceptionally highly conserved protein displaying 93% identity between avian and human amino acid sequences. It binds specifically to regulatory sequences in the promoter-proximal regions of chicken, mouse, and human c-myc oncogenes. CTCF contains two transcription repressor domains transferable to a heterologous DNA binding domain. One CTCF binding site, conserved in mouse and human c-myc genes, is found immediately downstream of the major P2 promoter at a sequence which maps precisely within the region of RNA polymerase II pausing and release. Gel shift assays of nuclear extracts from mouse and human cells show that CTCF is the predominant factor binding to this sequence. Mutational analysis of the P2-proximal CTCF binding site and transient-cotransfection experiments demonstrate that CTCF is a transcriptional repressor of the human c-myc gene. Although there is 100% sequence identity in the DNA binding domains of the avian and human CTCF proteins, the regulatory sequences recognized by CTCF in chicken and human c-myc promoters are clearly diverged. Mutating the contact nucleotides confirms that CTCF binding to the human c-myc P2 promoter requires a number of unique contact DNA bases that are absent in the chicken c-myc CTCF binding site. Moreover, proteolytic-protection assays indicate that several more CTCF Zn fingers are involved in contacting the human CTCF binding site than the chicken site. Gel shift assays utilizing successively deleted Zn finger domains indicate that CTCF Zn fingers 2 to 7 are involved in binding to the chicken c-myc promoter, while fingers 3 to 11 mediate CTCF binding to the human promoter. 
This flexibility in Zn finger usage reveals CTCF to be a unique "multivalent" transcription factor and provides the first feasible explanation of how certain homologous genes (i.e., c-myc) of different vertebrate species are regulated by the same factor and maintain similar expression patterns despite significant promoter sequence divergence.