The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Huntington's disease (HD) is one of an increasing number of neurodegenerative disorders caused by a CAG/polyglutamine repeat expansion. Mice have been generated that are transgenic for the 5' end of the human HD gene carrying (CAG)115-(CAG)150 repeat expansions. In three lines, the transgene is ubiquitously expressed at both mRNA and protein level. Transgenic mice exhibit a progressive neurological phenotype that exhibits many of the features of HD, including choreiform-like movements, involuntary stereotypic movements, tremor, and epileptic seizures, as well as nonmovement disorder components. This transgenic model will greatly assist in an eventual understanding of the molecular pathology of HD and may open the way to the testing of intervention strategies.
Summary
Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
Protein-protein interaction maps provide a valuable framework for a better understanding of the functional organization of the proteome. To detect interacting pairs of human proteins systematically, a protein matrix of 4456 baits and 5632 preys was screened by automated yeast two-hybrid (Y2H) interaction mating. We identified 3186 mostly novel interactions among 1705 proteins, resulting in a large, highly connected network. Independent pull-down and co-immunoprecipitation assays validated the overall quality of the Y2H interactions. Using topological and GO criteria, a scoring system was developed to define 911 high-confidence interactions among 401 proteins. Furthermore, the network was searched for interactions linking uncharacterized gene products and human disease proteins to regulatory cellular pathways. Two novel Axin-1 interactions were validated experimentally, characterizing ANP32A and CRMP1 as modulators of Wnt signaling. Systematic human protein interaction screens can lead to a more comprehensive understanding of protein function and cellular processes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.