Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTLs. Aggregate allelic bias signatures in TF footprints enabled us to reconstruct TF binding affinities de novo from genetic data, supporting the high quality of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions.
Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.
chromatin | diabetes | eQTL | epigenome | footprint
Type 2 diabetes (T2D) is a complex disease characterized by pancreatic islet dysfunction and insulin resistance in peripheral tissues; >90% of T2D SNPs identified through genome-wide association studies (GWASs) reside in nonprotein coding regions and are likely to perturb gene expression rather than alter protein function (1). In support of this finding, we and others recently showed that T2D GWAS SNPs are significantly enriched in enhancer elements that are specific to pancreatic islets (2-4). The critical next steps to translate these islet enhancer T2D genetic associations into mechanistic biological knowledge are (i) identifying the putative functional SNP(s) from all of those that are in tight linkage disequilibrium (LD), (ii) localizing their target gene(s), and (iii) understanding the direction of effect (increased or decreased target gene expression) conferred by the risk allele. Two recent studies analyzed genome variation and gene expression variation across human islet samples to identify cis-expression quantitative trait loci (cis-eQTLs) that linked T2D GWAS SNPs to target genes (5, 6). However, the transcription factor (TF) molecular mediators of the islet cis-eQTLs...
Type 2 diabetes (T2D) results from the combined effects of genetic and environmental factors on multiple tissues over time. Of the >100 variants associated with T2D and related traits in genome-wide association studies (GWAS), >90% occur in non-coding regions, suggesting a strong regulatory component to T2D risk. Here, to understand how T2D status, metabolic traits and genetic variation influence gene expression, we analyse skeletal muscle biopsies from 271 well-phenotyped Finnish participants with glucose tolerance ranging from normal to newly diagnosed T2D. We perform high-depth strand-specific mRNA-sequencing and dense genotyping. Computational integration of these data with epigenome data, including ATAC-seq on skeletal muscle, and transcriptome data across diverse tissues reveals that the tissue-specific genetic regulatory architecture of skeletal muscle is highly enriched in muscle stretch/super enhancers, including some that overlap T2D GWAS variants. In one such example, T2D risk alleles residing in a muscle stretch/super enhancer are linked to increased expression and alternative splicing of muscle-specific isoforms of ANK1.
Highlights
- ataqv is a software package for ATAC-seq quality control (QC) and visualization
- We show extensive variation in QC metrics for 2,009 public ATAC-seq datasets
- Increased Tn5 dosage increases power to detect almost all regulatory genomic features
- CTCF is a notable Tn5 dosage-insensitive factor
Interactions between transcription factors and chromatin are fundamental to genome organization and regulation and, ultimately, cell state. Here, we use information theory to measure signatures of organized chromatin resulting from transcription factor-chromatin interactions encoded in the patterns of the accessible genome, which we term chromatin information enrichment (CIE). We calculate CIE for hundreds of transcription factor motifs across human samples and identify two classes: low and high CIE. The 10–20% of common and tissue-specific high-CIE transcription factor motifs associate with higher protein–DNA residence time, including different binding site subclasses of the same transcription factor, increased nucleosome phasing, specific protein domains, and the genetic control of both chromatin accessibility and gene expression. These results show that variations in the information encoded in chromatin architecture reflect functional biological variation, with implications for cell state dynamics and memory.
In vertebrates, multiple transcription factors (TFs) bind to gene regulatory elements (promoters, enhancers, and silencers) to execute developmental expression changes. ChIP experiments are often used to identify where TFs bind to regulatory elements in the genome, but the requirement of TF-specific antibodies hampers analyses of tens of TFs at multiple loci. Here we tested whether TF binding predictions using ATAC-seq can be used to infer the identity of TFs that bind to functionally validated enhancers of the Cd4, Cd8, and Gata3 genes in thymocytes. We performed ATAC-seq at four distinct stages of development in mouse thymus, probing the chromatin accessibility landscape in double negative (DN), double positive (DP), CD4 single positive (SP4) and CD8 SP (SP8) thymocytes. Integration of chromatin accessibility with TF motifs genome-wide allowed us to infer stage-specific occupied TF binding sites within known and potentially novel regulatory elements. Our results provide genome-wide stage-specific T cell open chromatin profiles, and allow the identification of candidate TFs that drive thymocyte differentiation at each developmental stage.