Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases [1][2][3][4]. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-
Mammalian genomes harbor millions of noncoding elements called enhancers that quantitatively regulate gene expression, but it remains unclear which enhancers regulate which genes. Here we describe an experimental approach, based on CRISPR interference, RNA FISH, and flow cytometry (CRISPRi-FlowFISH), to perturb enhancers in the genome, and apply it to test >3,000 potential regulatory enhancer-gene connections across multiple genomic loci. A simple equation based on a mechanistic model for enhancer function performed remarkably well at predicting the complex patterns of regulatory connections we observe in our CRISPR dataset. This Activity-by-Contact (ABC) model involves multiplying measures of enhancer activity and enhancer-promoter 3D contacts, and can predict enhancer-gene connections in a given cell type based on chromatin state maps. Together, CRISPRi-FlowFISH and the ABC model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
We defined Activity (A) as the geometric mean of the read counts of DHS and H3K27ac ChIP-Seq at an element E, and Contact (C) as the normalized Hi-C contact frequency between E and the promoter of gene G (see Methods). (The ABC score performed similarly across a range of data preprocessing parameters, and when defining Activity using other combinations of measurements of chromatin accessibility, histone modifications, and nascent transcription; see Methods, Fig. S6, S7, S8.) The ABC model performed remarkably well, and much better than alternatives, at predicting DE-G connections in our CRISPR dataset. The quantitative ABC score correlated with the experimentally measured relative effects of candidate elements on gene expression (Spearman ρ for regulatory DE-G pairs = -0.68; Fig. 3C).
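The definitions of Activity and Contact above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the input layout, and the read counts are all hypothetical, and dividing each product by the sum over all candidate elements for the gene is an assumption (the excerpt itself only mentions multiplying Activity by Contact).

```python
import math


def activity(dhs_reads, h3k27ac_reads):
    """Activity (A): geometric mean of DHS and H3K27ac read counts at an element."""
    return math.sqrt(dhs_reads * h3k27ac_reads)


def abc_scores(elements):
    """Score candidate elements for one gene.

    elements maps a (hypothetical) element name to a tuple of
    (dhs_reads, h3k27ac_reads, contact), where contact is the normalized
    Hi-C contact frequency with the gene's promoter.
    """
    # Activity x Contact for each element.
    products = {
        name: activity(dhs, k27ac) * contact
        for name, (dhs, k27ac, contact) in elements.items()
    }
    # Assumption: express each product as a fraction of the total for the gene.
    total = sum(products.values())
    if total == 0:
        return {name: 0.0 for name in products}
    return {name: value / total for name, value in products.items()}


# Made-up numbers, for illustration only.
example_elements = {
    "e1": (100, 400, 0.50),
    "e2": (25, 100, 0.20),
    "e3": (36, 64, 0.25),
}
example_scores = abc_scores(example_elements)
for name, score in sorted(example_scores.items()):
    print(f"{name}: {score:.3f}")
```

Under this sketch, an element with high activity but little promoter contact scores low, as does a well-contacted element with little activity, which is the intuition behind multiplying the two measures.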
Binary classifiers based on thresholds on the ABC score substantially outperformed existing predictors of enhancer-gene regulation. For example, when we used an ABC threshold corresponding to 70% recall, the predictions had 63% precision, and the area under the precision-recall curve (AUPRC) was 0.66, compared to 0.36 for predictions based solely on genomic distance (Fig. 3A).
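To make the evaluation terms concrete, the following sketch shows one way to find the score threshold that reaches a target recall, read off precision at that threshold, and summarize the precision-recall curve. The scores and labels are toy values, not data from the study, and average precision is used here as one common AUPRC estimator; the excerpt does not say how the reported AUPRC was computed.

```python
def precision_recall_points(scores, labels):
    """Return (threshold, precision, recall) at each cut, highest score first.

    labels are 1 for a true regulatory pair and 0 otherwise.
    Assumes distinct scores and at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    points = []
    true_positives = 0
    for rank, (score, label) in enumerate(ranked, start=1):
        true_positives += label
        points.append((score, true_positives / rank, true_positives / total_positives))
    return points


def precision_at_recall(points, target_recall):
    """Return (threshold, precision) at the first cut reaching target_recall."""
    for threshold, precision, recall in points:
        if recall >= target_recall:
            return threshold, precision
    raise ValueError("target recall is never reached")


def average_precision(points):
    """Step-wise area under the precision-recall curve."""
    area = 0.0
    previous_recall = 0.0
    for _, precision, recall in points:
        area += precision * (recall - previous_recall)
        previous_recall = recall
    return area


# Toy example: five candidate pairs, three of them truly regulatory.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 1, 0]
toy_points = precision_recall_points(toy_scores, toy_labels)
toy_threshold, toy_precision = precision_at_recall(toy_points, 0.70)
print(toy_threshold, toy_precision)
print(round(average_precision(toy_points), 3))
```

The same functions can be applied to a baseline predictor, for example by using the negative of genomic distance as the score, to compare the two on equal terms.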
Genome-wide association studies have now identified tens of thousands of noncoding loci associated with human diseases and complex traits, each of which could reveal insights into biological mechanisms of disease. Many of the underlying causal variants are thought to affect enhancers, but we have lacked genome-wide maps of enhancer-gene regulation to interpret such variants. We previously developed the Activity-by-Contact (ABC) Model to predict enhancer-gene connections and demonstrated that it can accurately predict the results of CRISPR perturbations across several cell types. Here, we apply this ABC Model to create enhancer-gene maps in 131 cell types and tissues, and use these maps to interpret the functions of fine-mapped GWAS variants. For inflammatory bowel disease (IBD), causal variants are >20-fold enriched in enhancers in particular cell types, and ABC outperforms other regulatory methods at connecting noncoding variants to target genes. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes via variants in enhancers that act in different cell types. Guided by these variant-to-function maps, we show that an enhancer containing an IBD risk variant regulates the expression of PPIF to tune mitochondrial membrane potential. Together, our study reveals insights into principles of genome regulation, illuminates mechanisms that influence IBD, and demonstrates a generalizable strategy to connect common disease risk variants to their molecular and cellular functions.