Comparison of related genomes has emerged as a powerful lens for genome interpretation. Here, we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and report constrained elements covering ~4.2% of the genome. We use evolutionary signatures and comparison with experimental datasets to suggest candidate functions for ~60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events, and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements, and ~1,000 primate- and human-accelerated elements. Overlap with disease-associated variants suggests our findings will be relevant for studies of human biology and health.
Despite the conventional distinction between them, promoters and enhancers share many features in mammals, including divergent transcription and similar modes of transcription factor binding. Here, we examine the architecture of transcription initiation through comprehensive mapping of transcription start sites (TSSs) in human lymphoblastoid B-cell (GM12878) and chronic myelogenous leukemic (K562) tier 1 ENCODE cell lines. Using a nuclear run-on protocol called GRO-cap, which captures TSSs for both stable and unstable transcripts, we conduct detailed comparisons of thousands of promoters and enhancers in human cells. These analyses reveal a common architecture of initiation, including tightly spaced (110 bp) divergent initiation, similar frequencies of core-promoter sequence elements, highly positioned flanking nucleosomes, and two modes of transcription factor binding. Post-initiation transcript stability provides a more fundamental distinction between promoters and enhancers than patterns of histone modifications, transcription factors or co-activators. These results support a unified model of transcription initiation at promoters and enhancers.
“Orangutan” is derived from the Malay term “man of the forest” and aptly describes the Southeast Asian great apes native to Sumatra and Borneo. The orangutan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orangutan draft genome assembly and short read sequence data from five Sumatran and five Bornean orangutan genomes. Our analyses reveal that, compared to other primates, the orangutan genome has many unique features. Structural evolution of the orangutan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe the first primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orangutan genome structure. Orangutans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400k years ago (ya), is more recent than most previous studies and underscores the complexity of the orangutan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (Ne) expanded exponentially relative to the ancestral Ne after the split, while Bornean Ne declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.
RNA polymerase II (Pol II) transcribes hundreds of kilobases of DNA, limiting the production of mRNAs and lncRNAs. We used Global Run-on Sequencing (GRO-seq) to measure the rates of transcription by Pol II following gene activation. Elongation rates vary as much as 4-fold at different genomic loci and in response to two distinct cellular signaling pathways [i.e., 17β-estradiol (E2) and TNFα]. The rates are slowest near the promoter and increase during the first ~15 kb transcribed. Gene body elongation rates correlate with Pol II density, resulting in systematically higher rates of transcript production at genes with higher Pol II density. Pol II dynamics following short inductions indicate that E2 stimulates gene expression by increasing Pol II initiation, whereas TNFα reduces Pol II residence time at pause sites. Collectively, our results identify previously uncharacterized variation in the rate of transcription and highlight elongation as an important, variable, and regulated rate-limiting step during transcription.
Transcriptional regulatory elements (TREs), including enhancers and promoters, determine the transcription levels of associated genes. We have recently shown that global run-on and sequencing (GRO-seq) with enrichment for 5'-capped RNAs reveals active TREs with high accuracy. Here, we demonstrate that active TREs can be identified by applying sensitive machine-learning methods to standard GRO-seq data. This approach allows TREs to be assayed together with gene expression levels and other transcriptional features in a single experiment. Our prediction method, called discriminative Regulatory Element detection from GRO-seq (dREG), summarizes GRO-seq read counts at multiple scales and uses support vector regression to identify active TREs. The predicted TREs are more strongly enriched for several marks of transcriptional activation, including eQTL, GWAS-associated SNPs, H3K27ac, and transcription factor binding than those identified by alternative functional assays. Using dREG, we survey TREs in eight human cell types and provide new insights into global patterns of TRE function.
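As a rough illustration of what "summarizes GRO-seq read counts at multiple scales" could look like, the sketch below builds a feature vector for one candidate position by summing per-base read counts in windows of several sizes on either side of it. The function name, the window sizes, the number of windows, and the log transform are all assumptions made for illustration; they are not dREG's actual parameters or code. A vector of this kind, computed on each strand, is the sort of input one might then pass to a support vector regressor.

```python
from math import log1p


def multiscale_features(counts, center, window_sizes=(10, 50, 250), n_windows=2):
    """Summarize per-base read counts around `center` at several scales.

    For each window size w, take `n_windows` adjacent, non-overlapping
    windows on each side of `center`, sum the counts in each window, and
    apply log(1 + x). Windows that run off either end of `counts` are
    clipped, so a window entirely outside the region contributes 0.

    All parameter defaults here are hypothetical, chosen for illustration.
    """
    features = []
    for w in window_sizes:
        # k < 0 indexes windows upstream of center, k >= 0 downstream.
        for k in range(-n_windows, n_windows):
            start = center + k * w
            end = start + w
            lo, hi = max(start, 0), min(end, len(counts))
            total = sum(counts[lo:hi]) if lo < hi else 0
            features.append(log1p(total))
    return features


# Toy example: a flat background with a small pile-up of reads near position 50.
toy_counts = [0] * 100
for pos in range(45, 55):
    toy_counts[pos] = 3
toy_vector = multiscale_features(toy_counts, center=50, window_sizes=(5, 20))
```

Using several window sizes lets the same vector capture both a sharp local signal and the broader read density around it, which is the general idea behind a multi-scale summary.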