The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association.
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
New types of small RNAs distinct from microRNAs (miRNAs) are progressively being discovered in various organisms. In order to discover such novel small RNAs, a library of 17- to 26-base-long RNAs was created from prostate cancer cell lines and sequenced by ultra-high-throughput sequencing. A significant number of the sequences are derived from precise processing at the 5′ or 3′ end of mature or precursor tRNAs to form three series of tRFs (tRNA-derived RNA fragments): the tRF-5, tRF-3, and tRF-1 series. These sequences constitute a class of short RNAs that are second most abundant to miRNAs. Northern hybridization, quantitative RT-PCR, and splinted ligation assays independently measured the levels of at least 17 tRFs. To demonstrate the biological importance of tRFs, we further investigated tRF-1001, derived from the 3′ end of a Ser-TGA tRNA precursor transcript that is not retained in the mature tRNA. tRF-1001 is expressed highly in a wide range of cancer cell lines but much less in tissues, and its expression in cell lines was tightly correlated with cell proliferation. siRNA-mediated knockdown of tRF-1001 impaired cell proliferation with the specific accumulation of cells in G2, phenotypes that were reversed specifically by cointroducing a synthetic 2′-O-methyl tRF-1001 oligoribonucleotide resistant to the siRNA. tRF-1001 is generated in the cytoplasm by tRNA 3′-endonuclease ELAC2, a prostate cancer susceptibility gene. Our data suggest that tRFs are not random by-products of tRNA degradation or biogenesis, but an abundant and novel class of short RNAs with precise sequence structure that have specific expression patterns and specific biological roles. [Keywords: Small RNA; tRNA; deep sequencing; cancer cell proliferation] Supplemental material is available at http://www.genesdev.org.
Three muscle-specific microRNAs, miR-206, -1, and -133, are induced during differentiation of C2C12 myoblasts in vitro. Transfection of miR-206 promotes differentiation despite the presence of serum, whereas inhibition of the microRNA by antisense oligonucleotide inhibits cell cycle withdrawal and differentiation, which are normally induced by serum deprivation. Among the many mRNAs that are down-regulated by miR-206, the p180 subunit of DNA polymerase α and three other genes are shown to be direct targets. Down-regulation of the polymerase inhibits DNA synthesis, an important component of the differentiation program. The direct targets are decreased by mRNA cleavage that is dependent on predicted microRNA target sites. Unlike small interfering RNA–directed cleavage, however, the 5′ ends of the cleavage fragments are distributed and not confined to the target sites, suggesting involvement of exonucleases in the degradation process. In addition, inhibitors of myogenic transcription factors, Id1-3 and MyoR, are decreased upon miR-206 introduction, suggesting the presence of additional mechanisms by which microRNAs enforce the differentiation program.