Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Although it is generally accepted that cellular differentiation requires changes to transcriptional networks, dynamic regulation of promoters and enhancers at specific sets of genes has not been previously studied en masse. Exploiting the fact that active promoters and enhancers are transcribed, we simultaneously measured their activity in 19 human and 14 mouse time courses covering a wide range of cell types and biological stimuli. Enhancer RNAs, then messenger RNAs encoding transcription factors, dominated the earliest responses. Binding sites for key lineage transcription factors were simultaneously overrepresented in enhancers and promoters active in each cellular system. Our data support a highly generalizable model in which enhancer transcription is the earliest event in successive waves of transcriptional change during cellular differentiation or activation.
Large-scale sequencing projects have revealed an unexpected complexity in the origins, structures and functions of mammalian transcripts. Many loci are known to produce overlapping coding and non-coding RNAs with capped 5′ ends that vary in size. Methods that identify the 5′ ends of transcripts will facilitate the discovery of novel promoters and 5′ ends derived from secondary capping events. Such methods often require high input amounts of RNA not obtainable from highly refined samples such as tissue microdissections and subcellular fractions. Therefore, we have developed nanoCAGE (Cap Analysis of Gene Expression), a method that captures the 5′ ends of transcripts from as little as 10 nanograms of total RNA and CAGEscan, a mate-pair adaptation of nanoCAGE that captures the transcript 5′ ends linked to a downstream region. Both of these methods allow further annotation-agnostic studies of the complex human transcriptome.
BackgroundEvolutionarily conserved RFX transcription factors (TFs) regulate their target genes through a DNA sequence motif called the X-box. Thereby they regulate cellular specialization and terminal differentiation. Here, we provide a comprehensive analysis of all the eight human RFX genes (RFX1–8), their spatial and temporal expression profiles, potential upstream regulators and target genes.ResultsWe extracted all known human RFX1–8 gene expression profiles from the FANTOM5 database derived from transcription start site (TSS) activity as captured by Cap Analysis of Gene Expression (CAGE) technology. RFX genes are broadly (RFX1–3, RFX5, RFX7) and specifically (RFX4, RFX6) expressed in different cell types, with high expression in four organ systems: immune system, gastrointestinal tract, reproductive system and nervous system. Tissue type specific expression profiles link defined RFX family members with the target gene batteries they regulate. We experimentally confirmed novel TSS locations and characterized the previously undescribed RFX8 to be lowly expressed. RFX tissue and cell type specificity arises mainly from differences in TSS architecture. RFX transcript isoforms lacking a DNA binding domain (DBD) open up new possibilities for combinatorial target gene regulation. Our results favor a new grouping of the RFX family based on protein domain composition. We uncovered and experimentally confirmed the TFs SP2 and ESR1 as upstream regulators of specific RFX genes. Using TF binding profiles from the JASPAR database, we determined relevant patterns of X-box motif positioning with respect to gene TSS locations of human RFX target genes.ConclusionsThe wealth of data we provide will serve as the basis for precisely determining the roles RFX TFs play in human development and disease.Electronic supplementary materialThe online version of this article (10.1186/s12864-018-4564-6) contains supplementary material, which is available to authorized users.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.