Recent sequencing technologies enable joint quantification of promoters and their enhancer regions, allowing inference of enhancer–promoter links. We show that current enhancer–promoter inference methods produce a high rate of false positive links. We introduce FOCS, a new inference method, and by benchmarking against ChIA-PET, HiChIP, and eQTL data show that it achieves lower false discovery rates and, at the same time, higher inference power. By applying FOCS to 2630 samples taken from ENCODE, Roadmap Epigenomics, FANTOM5, and a new compendium of GRO-seq samples, we provide extensive enhancer–promoter maps (http://acgt.cs.tau.ac.il/focs). We illustrate the usability of our maps for deriving biological hypotheses. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1432-2) contains supplementary material, which is available to authorized users.
Given our recent discovery of somatic mutations in autism spectrum disorder (ASD)/intellectual disability (ID) genes in postmortem aged Alzheimer's disease brains correlating with increasing tauopathy, it is important to decipher whether tauopathy underlies the brain imaging findings of atrophy in ASD/ID children. We concentrated on activity-dependent neuroprotective protein (ADNP), a prevalent autism gene. The unique availability of multiple postmortem brain sections of a 7-year-old male, heterozygous for the ADNP de novo mutation c.2244Adup/p.His559Glnfs*3, allowed exploration of tauopathy, reflecting on a general unexplored mechanism. The tested subject exhibited autism, fine motor delays, severe intellectual disability, and seizures. The patient died after multiple organ failure following liver transplantation. To compare to other ADNP syndrome mutations, immortalized lymphoblastoid cell lines from three different patients (including ADNP p.Arg216*, p.Lys408Valfs*31, and p.Tyr719* heterozygous dominant mutations) and a control were subjected to RNA-seq. Immunohistochemistry and high-throughput gene expression profiling in numerous postmortem tissues followed. Comparisons to a control brain and to extensive datasets were used. Live cell imaging investigated the Tau-microtubule interaction, which protects against tauopathy. Extensive child brain tauopathy paralleled by multiple gene expression changes was discovered. Tauopathy was explained by direct mutation effects on Tau-microtubule interaction and correction by the ADNP active snippet NAP. Significant pathway changes (empirical P value < 0.05) included over 100 genes encompassing neuroactive ligand-receptor and cytokine-cytokine receptor interaction, MAPK and calcium signaling, axon guidance, and Wnt signaling pathways. Changes were also seen in steroid biosynthesis genes, suggesting sex differences.
Selecting the genes most affected by the ADNP mutations for gene expression analysis in multiple postmortem tissues identified Tau (MAPT)-gene-related expression changes compared with extensive normal gene expression (RNA-seq) databases. ADNP showed relatively reduced expression in the ADNP syndrome cerebellum, which was also observed for 25 additional genes (representing >50% of the tested genes), including NLGN1, NLGN2, PAX6, SMARCA4, and SNAP25, converging on nervous system development and tauopathy. NAP provided protection against the disruption of Tau-microtubule association by mutated ADNP. In conclusion, tauopathy may explain brain-imaging findings in ADNP syndrome children and may provide a new direction for the development of tauopathy-protecting drug candidates like NAP in ASD/ID.
Genome-wide expression profiling has revolutionized biomedical research; vast amounts of expression data from numerous studies of many diseases are now available. Making the best use of this resource in order to better understand disease processes and treatment remains an open challenge. In particular, disease biomarkers detected in case–control studies suffer from low reliability and are only weakly reproducible. Here, we present a systematic integrative analysis methodology to overcome these shortcomings. We assembled and manually curated more than 14 000 expression profiles spanning 48 diseases and 18 expression platforms. We show that when studying a particular disease, judicious utilization of profiles from other diseases and information on disease hierarchy improves classification quality, avoids overoptimistic evaluation of that quality, and enhances disease-specific biomarker discovery. This approach yielded specific biomarkers for 24 of the analyzed diseases. We demonstrate how to combine these biomarkers with large-scale interaction, mutation and drug target data, forming a highly valuable disease summary that suggests novel directions in disease understanding and drug repurposing. Our analysis also estimates the number of samples required to reach a desired level of biomarker stability. This methodology can greatly improve the exploitation of the mountain of expression profiles for better disease analysis.
Spatiotemporal gene expression patterns are governed to a large extent by enhancer elements, typically located distally from their target genes. Identification of enhancer-promoter (EP) links that are specific and functional in individual cell types is a key challenge in understanding gene regulation. We introduce CT-FOCS, a new statistical inference method that utilizes multiple replicates per cell type to infer cell type-specific EP links. Computationally predicted EP links are usually benchmarked against experimentally determined chromatin interactions measured by ChIA-PET. We expand this validation scheme by introducing the concept of a connected loop set, which combines loops that overlap in their anchor sites. Analyzing 1,366 samples from ENCODE, Roadmap Epigenomics, and FANTOM5, CT-FOCS inferred highly cell type-specific EP links more accurately than a state-of-the-art method. We illustrate how our inferred EP links drive cell type-specific gene expression and regulation.
TargetFinder [13]. All these methods rely on data of multiple chromatin marks and expression data for the studied cell types.
The JEME algorithm finds global and cell type-active EP links (but not necessarily cell type-specific ones) using only 1-5 different omics data types [14]. Each reported EP link is given a score denoting its tendency to be active in a given cell type. JEME reports an average of 4,095 active EP links per cell type, and most of these may be nonspecific.
Several recent studies aimed at finding ct-links experimentally. Rajarajan et al. [15] used in situ Hi-C and schizophrenia risk locus to identify 1,702 and 442 neuronal progenitor cell (NPC)-specific and neuron-specific 3D chromatin interactions for 386 and 385 genes, respectively. Some of the NPC-specific and neuron-specific interactions may be enhancer-promoter interactions (or ct-links). Gasperini et al. [16] used CRISPR screening to perturb 5,920 human candidate enhancers that may affect gene expression at the single-cell level in combination with eQTL analysis, and identified 664 EP links covering 479 genes enriched with K562-specific genes and lineage-specific transcription factors (TFs; reviewed in [17]). Remarkably, both studies reported far fewer links than JEME, indicating that only a small portion of the EP links that are active in a cell type are specific to it.
Here, we introduce CT-FOCS (Cell Type FDR-corrected OLS with Cross-validation and Shrinkage), a novel method for inferring ct-links from large-scale compendia of hundreds of cell types measured by a single omic technique (e.g., DNase Hypersensitive Sites sequencing; DHS-seq). It is built upon our previously published method, FOCS [18], which infers global EP links that show high correlation between the enhancer and the promoter activity patterns across many samples. Given the omic profile for a set of cell types, each one with replicates, CT-FOCS uses linear mixed effect models (LMMs) to infer ct-links. CT-FOCS was applied to public DNase Hypersensitive Sites (DHS) profiles from ENCODE and Roadmap Epigenomics [19][20][21], and ca...
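The connected loop set idea from the CT-FOCS abstract (combining ChIA-PET loops that overlap in their anchor sites) can be illustrated with a small sketch. This is not the authors' implementation: the tuple layout for anchors and loops, the half-open coordinates, the "at least one shared base" overlap rule, and the function names `anchors_overlap` and `connected_loop_sets` are all assumptions made for illustration. Under those assumptions, a connected loop set is simply a connected component of the graph whose nodes are loops and whose edges join loops with overlapping anchors.

```python
from collections import defaultdict

# Assumed layout (illustrative only, not the CT-FOCS data format):
#   anchor = (chrom, start, end) with half-open coordinates
#   loop   = (anchor_a, anchor_b)


def anchors_overlap(a, b):
    """True if two anchors lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def connected_loop_sets(loops):
    """Group loops into connected components; loops are linked when any anchors overlap."""
    parent = list(range(len(loops)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Quadratic pairwise comparison: acceptable for a sketch, too slow for
    # genome-scale loop sets, where an interval index would be preferable.
    for i in range(len(loops)):
        for j in range(i + 1, len(loops)):
            if any(anchors_overlap(a, b) for a in loops[i] for b in loops[j]):
                union(i, j)

    groups = defaultdict(list)
    for i, loop in enumerate(loops):
        groups[find(i)].append(loop)
    return list(groups.values())


# Hypothetical toy loops, invented for this example.
loops = [
    (("chr1", 100, 200), ("chr1", 5000, 5100)),
    (("chr1", 150, 250), ("chr1", 9000, 9100)),      # overlaps the first anchor of loop 0
    (("chr1", 9050, 9150), ("chr1", 20000, 20100)),  # overlaps the second anchor of loop 1
    (("chr2", 100, 200), ("chr2", 5000, 5100)),      # different chromosome, stays alone
]
loop_sets = connected_loop_sets(loops)
print(sorted(len(s) for s in loop_sets))  # [1, 3]
```

In this toy input the first three loops chain together through shared anchor regions, so a predicted EP link could be checked against the whole set rather than against a single loop; how the published method uses the sets for validation is not specified in the text above.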